This guide walks through the complete process for generating audio in ZeroTwo Studio, with prompt examples for different audio types.
Audio generation is primarily available on Pro+ plans. If you cannot access it, upgrade your plan in Settings → Account.

Step-by-step: generate audio

1. Navigate to Audio Studio

Click Audio in the topbar or go to /studio/audio.

2. Describe the audio you want

Write a prompt describing the audio you want. Be specific about genre, mood, tempo, instruments, duration, and how the audio will be used.

3. Select a model

Choose an audio generation model from the model dropdown. Models differ in how they handle music, voice synthesis, and sound effects, so select the one that matches your use case. See audio models for guidance.

4. Set duration (if applicable)

If the selected model supports duration control, specify how long the audio should be. Also state the duration in your prompt: "Create a 45-second background track...".

5. Click Generate

Click the Generate button. Generation typically takes from a few seconds to about a minute, depending on the model and the requested duration.

6. Preview and download

When generation completes, an audio player appears in the gallery. Click play to preview in the browser. Use the download icon to save the file to your device.
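If you write many prompts, it can help to assemble them from the details step 2 asks for. The following is a minimal sketch that only builds a prompt string to paste into the prompt box. The `AudioBrief` class and `build_prompt` function are hypothetical names for illustration and are not part of ZeroTwo Studio.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioBrief:
    """Hypothetical container for the details listed in step 2."""
    genre: str
    mood: str
    instruments: List[str]
    use: str                          # how the audio will be used
    tempo: Optional[str] = None       # e.g. "120 BPM" or "slow and deliberate"
    duration_s: Optional[int] = None  # length in seconds, if you want to state one


def build_prompt(brief: AudioBrief) -> str:
    """Join the brief into a single comma-separated prompt string."""
    length = f"{brief.duration_s}-second " if brief.duration_s else ""
    head = f"{length}{brief.genre} track for {brief.use}"
    details = [brief.mood, ", ".join(brief.instruments), brief.tempo]
    # Skip any detail that was left empty.
    return ", ".join(part for part in [head, *details] if part)


brief = AudioBrief(
    genre="upbeat jazz",
    mood="bright and energetic",
    instruments=["piano", "upright bass"],
    use="a product video",
    tempo="120 BPM",
    duration_s=45,
)
print(build_prompt(brief))
# 45-second upbeat jazz track for a product video, bright and energetic, piano, upright bass, 120 BPM
```

Keeping the fields separate makes it easy to change one detail, such as the duration, and regenerate without rewriting the whole prompt.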

Prompt examples by audio type

Background music for video

Background music works best when you describe the genre, mood, tempo, and intended emotional effect:
  • Upbeat jazz piano loop for a 30-second product video, bright and energetic, 120 BPM
  • Calm cinematic background track for a travel documentary, orchestral strings, 60 seconds, gentle swell
  • Uplifting corporate presentation music, modern, professional, subtle piano and light percussion, 90 seconds
  • Tense thriller-style underscore, low strings, minimal, building tension, 45 seconds
  • Warm acoustic guitar loop, coffeeshop ambience, relaxed and friendly, 60 seconds

Ambient soundscapes

  • Calm ambient soundscape with rain and distant thunder, indoor atmosphere, no music
  • Forest environment: birds chirping, wind through leaves, distant stream, peaceful morning
  • Busy city street ambience: traffic, distant conversations, city energy, urban daytime
  • Underwater ambience: gentle bubbling, muffled resonance, serene and spacious

Intros and jingles

  • Short 5-second intro jingle for a tech podcast, modern, professional, upbeat
  • 10-second outro music with fade, light and friendly, optimistic
  • 3-second notification sound: positive, clear, modern UI style
  • Brand jingle: catchy, memorable, 8 seconds, fun and approachable

AI narration / text-to-speech

  • Read the following text in a warm, professional male voice at a medium pace: [text]
  • Narrate in an enthusiastic, energetic tone suitable for a product demo: [text]
  • Calm, soothing female voice for a meditation guide introduction: [text]
  • News anchor style, clear and authoritative: [text]
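The narration prompts above share one shape: a voice description followed by the text to read in place of `[text]`. As a rough sketch, a small helper can fill in that template. The function name `narration_prompt` and its parameters are assumptions made for this example, not a ZeroTwo Studio API.

```python
def narration_prompt(text: str, voice: str, pace: str = "") -> str:
    """Build a text-to-speech prompt in the style of the examples above."""
    # Collapse line breaks and repeated spaces so the script reads as one passage.
    script = " ".join(text.split())
    if not script:
        raise ValueError("text must not be empty")
    style = f"Read the following text in a {voice} voice"
    if pace:
        style += f" at a {pace} pace"
    return f"{style}: {script}"


print(narration_prompt("Welcome to\n the demo.", "warm, professional male", "medium"))
# Read the following text in a warm, professional male voice at a medium pace: Welcome to the demo.
```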

Sound effects

  • Single camera shutter click, professional DSLR
  • Typing on a mechanical keyboard, brief burst, 2 seconds
  • Coin drop on hard floor, metallic ring
  • Positive success chime, UI sound, cheerful
  • Door opening and closing, wooden door, interior

Output formats

Audio is available for download in standard formats depending on the selected model:
Format         Best for
MP3            Universal sharing, web, social media, mobile
WAV            Professional audio workflows, lossless quality, video editing
Other formats  Model-dependent; check the download options

MP3 is recommended for most uses. Use WAV if you need lossless audio for professional production work.
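To get a feel for the size trade-off between the two formats, the arithmetic can be sketched as follows. The sample rate, bit depth, channel count, and MP3 bitrate below are common defaults chosen as assumptions for illustration. The files a given model actually produces may use different settings.

```python
def wav_bytes(seconds: float, sample_rate: int = 44_100,
              bit_depth: int = 16, channels: int = 2) -> int:
    """Approximate size of uncompressed PCM audio (ignores the small file header)."""
    return int(seconds * sample_rate * (bit_depth // 8) * channels)


def mp3_bytes(seconds: float, bitrate_kbps: int = 128) -> int:
    """Approximate size of a constant-bitrate MP3 (ignores metadata tags)."""
    return int(seconds * bitrate_kbps * 1000 / 8)


print(wav_bytes(60))  # 10584000 bytes, roughly 10.6 MB
print(mp3_bytes(60))  # 960000 bytes, roughly 1 MB
```

Under these assumptions a 60-second WAV is about eleven times larger than the MP3, which is why MP3 suits sharing and WAV suits editing.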

Tips for better audio

  • Specify duration explicitly: "Create a 60-second..." or "a short 5-second..." helps the model target the right length
  • Describe the intended use: telling the model what the audio is for ("background music for a product video", "notification sound for a mobile app") helps it understand context and emotional register
  • Name instruments specifically: "piano and strings" is better than "music"; "acoustic guitar" is better than "guitar"
  • Describe tempo in BPM or with adjectives: "120 BPM", "slow and deliberate", "energetic and fast-paced"
  • Mention what to avoid: "no lyrics", "no drums", or "instrumental only" helps prevent unwanted elements

For music intended to loop (like background tracks for a website or app), add "seamless loop" or "designed to loop" to your prompt. Some models are optimized for looping audio and will create tracks with smooth loop points.
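The tips above can also be turned into a quick self-check before you click Generate. The sketch below is a rough heuristic only: the patterns, the adjective list, and the names `missing_details` and `with_loop_hint` are assumptions made for this example, and a prompt that trips a check may still be perfectly good.

```python
import re

# Each pattern looks for one kind of detail the tips recommend.
CHECKS = {
    "duration": re.compile(r"\b\d+[- ]seconds?\b", re.IGNORECASE),
    "tempo": re.compile(r"\b\d+\s*bpm\b|\b(slow|fast|upbeat|energetic|relaxed)\b",
                        re.IGNORECASE),
    "exclusions": re.compile(r"\bno \w+|\binstrumental only\b", re.IGNORECASE),
}


def missing_details(prompt: str) -> list:
    """Return the names of the checks that find nothing in the prompt."""
    return [name for name, pattern in CHECKS.items() if not pattern.search(prompt)]


def with_loop_hint(prompt: str) -> str:
    """Append "seamless loop" unless the prompt already asks for a loop."""
    lowered = prompt.lower()
    if "seamless loop" in lowered or "designed to loop" in lowered:
        return prompt
    return prompt.rstrip(" ,.") + ", seamless loop"


print(missing_details("Warm acoustic guitar"))
# ['duration', 'tempo', 'exclusions']
print(missing_details("Calm piano, 60 seconds, slow and deliberate, no drums"))
# []
print(with_loop_hint("Warm acoustic guitar, 60 seconds"))
# Warm acoustic guitar, 60 seconds, seamless loop
```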