ZeroTwo offers multiple AI video generation models, each with different strengths in motion quality, stylistic output, clip length, and prompt adherence. This page gives an overview of the available models and guidance on choosing between them.
AI video generation technology is evolving rapidly. New models are added to ZeroTwo regularly. Check the model dropdown in the video Studio for the current full list of available models.

About video generation models

Video generation models are fundamentally different from text or image models. They’re specialized for:
  • Temporal consistency: ensuring subjects, lighting, and scene elements remain coherent across frames
  • Realistic motion: generating natural-looking movement for people, objects, and environments
  • Prompt adherence: translating text descriptions into specific visual action and camera behavior
This is a significantly harder problem than generating a single image, which is why AI video quality is still evolving and generation times are longer.

Model capabilities overview

| Capability | Notes |
| --- | --- |
| Output length | Typically 2–10 seconds per clip (model-dependent) |
| Aspect ratios | Landscape (16:9), Portrait (9:16), Square (1:1) |
| Output formats | MP4, WebM, MOV |
| Plan requirement | Pro+ for all video models |
| Generation time | 30 seconds to 5+ minutes depending on model and length |

Choosing a model

Different models have different strengths. General guidance:
| If you need… | Look for… |
| --- | --- |
| Realistic human motion | Models marketed for realistic/cinematic output |
| Animated or stylized video | Models with style or animation emphasis |
| Fastest generation | Lighter/faster model variants |
| Longest clip length | Check model-specific max duration in the model dropdown |
| Best prompt adherence | Try multiple models; adherence varies significantly |
Because video generation takes minutes and uses more compute than images, it’s worth spending time on a strong prompt before generating. Review the prompt tips in the create videos guide before your first generation.

Working with short clips

AI video models produce the most consistent results with short clips — typically 2–5 seconds. Short clips:
  • Complete faster
  • Have better subject and motion consistency
  • Are easier to iterate on
For longer videos, generate multiple short clips and combine them in a video editing tool. This gives you more control over pacing and lets you replace any segment that didn’t generate well.

Prompt strategies for different model types

Different model strengths call for different prompting approaches.

For realistic/cinematic models: Focus on naturalistic descriptions: real-world settings, human subjects, natural lighting, and grounded action. These models respond well to photographic terminology: “shallow depth of field”, “natural lighting”, “handheld documentary style”.

Example: A woman in her 30s walking through a busy farmers market, warm morning light, handheld camera, documentary style, slow motion

For animated or stylized models: Lean into stylistic descriptions: art styles, color palettes, and animation characteristics. Reference genres or aesthetic movements: “Studio Ghibli style”, “cel-shaded animation”, “retro anime aesthetic”.

Example: An animated fox running through an autumn forest, Studio Ghibli style, falling leaves, warm amber and gold color palette, smooth flowing motion

For all models:
  • Keep scenes focused on one or two subjects
  • Describe camera movement explicitly
  • Specify time of day and lighting conditions
  • Mention mood and visual tone
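
The checklist above can be sketched as a small helper that assembles prompt components in a consistent order. This is purely illustrative, not a ZeroTwo API; the function name and parameters are assumptions for the example.

```python
# Hypothetical helper: assembles the recommended prompt components
# (subject, action, camera, lighting, mood) into one prompt string.
# Illustrates the "For all models" checklist; not part of ZeroTwo.

def build_video_prompt(subject, action, camera=None, lighting=None, mood=None):
    """Join the components in a consistent order, skipping any left blank."""
    parts = [subject, action, camera, lighting, mood]
    return ", ".join(p for p in parts if p)

prompt = build_video_prompt(
    subject="a woman in her 30s",
    action="walking through a busy farmers market",
    camera="handheld camera, slow motion",
    lighting="warm morning light",
    mood="documentary style",
)
```

Writing the prompt as named components makes it easier to change just the action or camera movement between iterations, which matches the refinement steps in the workflow below.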

Generation workflow tips

Before you generate

  1. Write your prompt in full before opening the model dropdown
  2. Re-read it and ask: is the subject clear? Is the action specific? Is the camera behavior described?
  3. Select the model you want to test first
  4. Generate a short clip (2–4 seconds) to validate the direction

After generation

  1. If motion is wrong but subject is right: refine the action description, keep the model
  2. If subject is wrong: revise the subject description and try again
  3. If both are off: try a different model with the same prompt to see if model selection is the issue
  4. If quality is generally low: try a different model known for higher-quality output

Building longer sequences

For content longer than 10 seconds:
  1. Break the story into distinct scenes (each 2–5 seconds)
  2. Generate each scene as a separate clip
  3. Combine the clips in a video editor (CapCut, DaVinci Resolve, Final Cut Pro, Premiere)
This approach also lets you replace any individual clip that didn’t generate well.
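
If you prefer a command-line workflow to a GUI editor, one common way to stitch MP4 clips is ffmpeg's concat demuxer. The sketch below writes the file list that demuxer expects; the clip filenames are hypothetical placeholders for your downloaded clips, and ffmpeg itself must be installed separately.

```python
# Sketch: prepare input for ffmpeg's concat demuxer to stitch short
# generated clips into one video. Filenames are placeholders.
from pathlib import Path

def write_concat_list(clips, list_path="clips.txt"):
    """Write the "file '...'" list format ffmpeg's concat demuxer expects."""
    lines = [f"file '{c}'" for c in clips]
    Path(list_path).write_text("\n".join(lines) + "\n")
    return list_path

clips = ["scene1.mp4", "scene2.mp4", "scene3.mp4"]
list_file = write_concat_list(clips)
# Then run (clips should share codec, resolution, and frame rate):
#   ffmpeg -f concat -safe 0 -i clips.txt -c copy combined.mp4
```

Using `-c copy` avoids re-encoding, so joining is fast and lossless, but it only works when all clips were generated with matching encoding settings.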

Model updates

ZeroTwo’s video model library is updated as new and improved models become available from AI providers. The model dropdown in the video Studio always reflects the current available selection. Check the ZeroTwo changelog for announcements about new video model additions.

Creating videos

Full guide to video prompts and the generation workflow.

Supported formats

MP4, WebM, and MOV — which to choose for your use case.

Video troubleshooting

Common issues and fixes for video generation.

Image models

Compare image generation models for still-image creative work.