AI video generation technology is evolving rapidly. New models are added to ZeroTwo regularly. Check the model dropdown in the video Studio for the current full list of available models.
About video generation models
Video generation models are fundamentally different from text or image models. They’re specialized for:
- Temporal consistency: ensuring subjects, lighting, and scene elements remain coherent across frames
- Realistic motion: generating natural-looking movement for people, objects, and environments
- Prompt adherence: translating text descriptions into specific visual action and camera behavior
Model capabilities overview
| Capability | Notes |
|---|---|
| Output length | Typically 2–10 seconds per clip (model-dependent) |
| Aspect ratios | Landscape (16:9), Portrait (9:16), Square (1:1) |
| Output formats | MP4, WebM, MOV |
| Plan requirement | Pro+ for all video models |
| Generation time | 30 seconds to 5+ minutes depending on model and length |
Choosing a model
Different models have different strengths. General guidance:
| If you need… | Look for… |
|---|---|
| Realistic human motion | Models marketed for realistic/cinematic output |
| Animated or stylized video | Models with style or animation emphasis |
| Fastest generation | Lighter/faster model variants |
| Longest clip length | Check model-specific max duration in the model dropdown |
| Best prompt adherence | Try multiple models — adherence varies significantly |
Working with short clips
AI video models produce the most consistent results with short clips, typically 2–5 seconds. Short clips:
- Complete faster
- Have better subject and motion consistency
- Are easier to iterate on
Prompt strategies for different model types
Different model strengths call for different prompting approaches:
For realistic/cinematic models:
Focus on naturalistic descriptions: real-world settings, human subjects, natural lighting, and grounded action. These models respond well to photographic terminology: “shallow depth of field”, “natural lighting”, “handheld documentary style”.
Example: A woman in her 30s walking through a busy farmers market, warm morning light, handheld camera, documentary style, slow motion
For animated or stylized models:
Lean into stylistic descriptions — art styles, color palettes, and animation characteristics. Reference genres or aesthetic movements: “Studio Ghibli style”, “cel-shaded animation”, “retro anime aesthetic”.
Example: An animated fox running through an autumn forest, Studio Ghibli style, falling leaves, warm amber and gold color palette, smooth flowing motion
For all models:
- Keep scenes focused on one or two subjects
- Describe camera movement explicitly
- Specify time of day and lighting conditions
- Mention mood and visual tone
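One way to keep these elements from being forgotten is to assemble the prompt from named parts before pasting it into the video Studio. The sketch below is only an illustration: `VideoPrompt` and its fields are hypothetical names, not part of ZeroTwo, and the comma-separated ordering is an assumption rather than a documented requirement.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the parts of a video prompt (not a ZeroTwo API)."""
    subject: str
    action: str
    camera: str
    lighting: str
    mood: str
    style: str = ""  # optional, e.g. "cel-shaded animation"

    def render(self) -> str:
        # Join the non-empty parts into a single comma-separated prompt string.
        parts = [
            f"{self.subject} {self.action}",
            self.lighting,
            self.camera,
            self.mood,
            self.style,
        ]
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = VideoPrompt(
    subject="A woman in her 30s",
    action="walking through a busy farmers market",
    camera="handheld camera",
    lighting="warm morning light",
    mood="documentary style",
)
print(prompt.render())
```

Because every field except `style` is required, leaving out the camera movement or lighting raises an error at construction time instead of producing an underspecified prompt.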
Generation workflow tips
Before you generate
- Write your prompt in full before opening the model dropdown
- Re-read it and ask: is the subject clear? Is the action specific? Is the camera behavior described?
- Select the model you want to test first
- Generate a short clip (2–4 seconds) to validate the direction
After generation
- If motion is wrong but subject is right: refine the action description, keep the model
- If subject is wrong: revise the subject description and try again
- If both are off: try a different model with the same prompt to see if model selection is the issue
- If quality is generally low: try a different model known for higher-quality output
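The review steps above amount to a small decision procedure. As a rough sketch, they could be written down like this; `next_step` and its boolean flags are hypothetical names used only for illustration, and the order of the checks is one reasonable reading of the list.

```python
def next_step(subject_ok: bool, motion_ok: bool, quality_ok: bool = True) -> str:
    """Map a review of a generated clip to the next action suggested above."""
    if not subject_ok and not motion_ok:
        # Both are off: check whether model selection is the issue.
        return "try a different model with the same prompt"
    if not subject_ok:
        return "revise the subject description and try again"
    if not motion_ok:
        return "refine the action description, keep the model"
    if not quality_ok:
        return "try a different model known for higher-quality output"
    return "keep the clip"


print(next_step(subject_ok=True, motion_ok=False))
```

Changing one thing per iteration, either the prompt or the model, makes it easier to tell which change improved the result.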
Building longer sequences
For content longer than 10 seconds:
- Break the story into distinct scenes (each 2–5 seconds)
- Generate each scene as a separate clip
- Combine in a video editor (CapCut, DaVinci Resolve, Final Cut Pro, Premiere)
- This approach also lets you replace any individual clip that didn’t generate well
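Before generating anything, it can help to write the sequence down as a scene plan and check that each scene stays within the 2–5 second range. The following is a minimal sketch under that assumption; `Scene` and `validate_plan` are hypothetical helpers for planning on your own machine, not ZeroTwo features.

```python
from dataclasses import dataclass

MIN_SECONDS = 2  # lower bound of the per-scene range suggested above
MAX_SECONDS = 5  # upper bound of the per-scene range suggested above


@dataclass
class Scene:
    prompt: str
    seconds: float


def validate_plan(scenes: list[Scene]) -> float:
    """Return the total duration, raising if any scene is outside 2-5 seconds."""
    for i, scene in enumerate(scenes, start=1):
        if not MIN_SECONDS <= scene.seconds <= MAX_SECONDS:
            raise ValueError(
                f"scene {i} is {scene.seconds}s; expected {MIN_SECONDS}-{MAX_SECONDS}s"
            )
    return sum(scene.seconds for scene in scenes)


plan = [
    Scene("Wide shot of a farmers market at sunrise, slow pan right", 4),
    Scene("A woman picks up an apple, close-up, shallow depth of field", 3),
    Scene("She walks away through the crowd, handheld camera", 5),
]
total = validate_plan(plan)
print(f"{len(plan)} clips, {total} seconds total")
```

Each `Scene` maps to one generated clip, so a clip that turns out badly can be regenerated from its own prompt without touching the rest of the plan.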
Model updates
ZeroTwo’s video model library is updated as new and improved models become available from AI providers. The model dropdown in the video Studio always reflects the current available selection. Check the ZeroTwo changelog for announcements about new video model additions.
Related
Creating videos
Full guide to video prompts and the generation workflow.
Supported formats
MP4, WebM, and MOV — which to choose for your use case.
Video troubleshooting
Common issues and fixes for video generation.
Image models
Compare image generation models for still-image creative work.

