This tutorial covers the voice generation and dubbing features, which you use to create narration and multi-character dialogue for your videos.

1. What is Voice Generation?

Voice Generation, also called text-to-speech (TTS), converts your script text into natural-sounding narration.

1.1 When to Use Voice Generation

Voice generation is ideal for:

  • Video narration: Add voiceover to your story or video content
  • Multi-character dialogue: Assign unique voices to different characters
  • Quick iteration: Rapidly regenerate audio when updating scripts
  • Professional quality: Produce broadcast-quality narration without recording equipment

2. Two Ways to Generate Voice

You can generate voice in two places, each suited to a different workflow:

2.1 Editor Page

Quick voice generation with basic controls for rapid iteration.

2.2 Dubbing Page

Advanced voice editing with fine-grained control over sentences, pauses, and voice selection.

3. Quick Start: Generate Voice from Editor

The Editor provides fast access to voice generation for your entire project or selected shots.

[Image: Editor Voice Generation]

3.1 How to Regenerate Voice in Editor

  1. Click Regenerate All Audio in the top header
  2. Choose the narrator voice and character voices from the dropdowns
  3. Click the play button to preview a voice sample
  4. Click Regenerate to apply the voices to all shots

3.2 Per-Shot Voice Regeneration

For quick A/B testing of different voices:

  1. Select a shot in the timeline
  2. Find the audio section in the right panel
  3. Click Regenerate Audio for that specific shot

Use case: Testing different voice options on a single shot without affecting the rest of the project.

4. Dubbing Page: Your Advanced Voice Studio

The Dubbing Page is your main workspace for detailed voice editing and multi-character dialogue management.

[Image: Dubbing Page Layout]

4.1 Page Layout Overview

The Dubbing Page consists of two main areas:

Left Area

  • Narration List: Shows the narration text for every shot; click an entry to select it
  • Timeline Controls: Playback controls and audio splitting timeline

Right Panel (Three Tabs)

  • Narration Settings: Adjust narrator voice and parameters
  • Character Voices: Assign voices to different characters
  • Lip Sync: Generate lip-synced videos

5. Understanding Roles and Voice Assignment

Roles help you organize voices for different characters and narrators, ensuring consistency throughout your project.

[Image: Role and Voice Assignment]

5.1 How Roles Work

  • Narrator: Default voice for narration text
  • Characters: Assign different voices to speaking characters
  • Role-to-voice mapping: Bind a specific voice to each role
  • Global application: Set a voice for a role once and it applies to every shot that uses that role

5.2 Assigning Voices to Roles

  1. Click the Character Voices tab in the right panel
  2. Select a role from the role list (Narrator or character name)
  3. Click the Select Voice button to open the voice library
  4. Choose a voice from the library
  5. The voice automatically applies to all shots for that role

Pro tip: Assign voices to all roles before generating audio to ensure consistency.
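
The role-to-voice mapping described above can be pictured as a simple lookup table. The following is a minimal sketch, not the product's implementation: the `Shot` class, the `resolve_voices` function, and the voice identifiers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical data model; none of these names come from the product itself.
@dataclass
class Shot:
    shot_id: int
    role: str   # "Narrator" or a character name
    text: str   # narration or dialogue text for this shot

def resolve_voices(shots, role_to_voice):
    """Return ({shot_id: voice}, sorted list of roles that have no voice bound)."""
    resolved = {}
    missing = set()
    for shot in shots:
        voice = role_to_voice.get(shot.role)
        if voice is None:
            missing.add(shot.role)
        else:
            resolved[shot.shot_id] = voice
    return resolved, sorted(missing)

shots = [
    Shot(1, "Narrator", "Long ago, in a quiet village..."),
    Shot(2, "Mira", "Did you hear that?"),
    Shot(3, "Narrator", "The wind fell silent."),
    Shot(4, "Tomas", "It was nothing."),
]
# One voice per role; "Tomas" has deliberately been left unassigned.
role_to_voice = {"Narrator": "voice_warm_01", "Mira": "voice_bright_02"}

resolved, missing = resolve_voices(shots, role_to_voice)
print(resolved)  # {1: 'voice_warm_01', 2: 'voice_bright_02', 3: 'voice_warm_01'}
print(missing)   # ['Tomas']
```

Because each shot looks up its voice through its role, changing one entry in the mapping changes every shot for that role, which is why assigning all roles before generating keeps the project consistent.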

6. Browse and Select Voices

The voice library provides hundreds of voices to choose from, each with unique characteristics and styles.

[Image: Voice Library]

6.1 Voice Library Features

Powerful Filtering:

  • By style: Narrative, dramatic, promotional, casual
  • By language: English, Chinese, Japanese, and more
  • By gender: Male, female, neutral

Preview & Favorites:

  • Preview voices: Click the play button to hear samples
  • Favorites: Star your favorite voices for quick access
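
To make the filter behavior concrete, here is a small sketch of how combined filters narrow a list of voices. It assumes that filters combine with AND, which the text above does not state; the `Voice` record, its field names, and the sample entries are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical record for one library entry; field names are assumptions.
@dataclass(frozen=True)
class Voice:
    name: str
    style: str      # e.g. "narrative", "dramatic", "promotional", "casual"
    language: str   # e.g. "English", "Chinese", "Japanese"
    gender: str     # "male", "female", or "neutral"
    favorite: bool = False

def filter_voices(voices, style=None, language=None, gender=None, favorites_only=False):
    """Return the voices that satisfy every filter that is set (None means any)."""
    matches = []
    for voice in voices:
        if style is not None and voice.style != style:
            continue
        if language is not None and voice.language != language:
            continue
        if gender is not None and voice.gender != gender:
            continue
        if favorites_only and not voice.favorite:
            continue
        matches.append(voice)
    return matches

library = [
    Voice("Ava", "narrative", "English", "female", favorite=True),
    Voice("Kenji", "dramatic", "Japanese", "male"),
    Voice("Sam", "casual", "English", "neutral"),
    Voice("Lin", "narrative", "Chinese", "female"),
]

english = filter_voices(library, language="English")
print([v.name for v in english])  # ['Ava', 'Sam']
```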

6.2 How to Select a Voice

  1. In the Character Voices panel, click Select Voice
  2. In the voice library modal, use filters to narrow down options
  3. Click the play button on voice cards to preview
  4. Click Select to assign the voice to the current role

Selection tips:

  • Listen to multiple voices before deciding
  • Consider the character’s personality and age
  • Test voices with actual script text when possible

7. Multi-Character Voice Generation

Generate audio for all shots with one click, applying the appropriate voice to each character automatically.

7.1 How to Use Multi-Character Voice Generation

  1. Ensure all roles have assigned voices
  2. Click the Multi-Character Voice button above the timeline
  3. The system generates audio for all shots with narration text
  4. Wait for generation to complete (may take a few minutes)
  5. Preview all audio on the timeline when complete

Generation time: Depends on project length, typically 2-5 minutes.
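
Conceptually, batch generation is a loop over the shots that have narration text, using each role's assigned voice. The sketch below illustrates that idea under stated assumptions: the shot dictionaries, `generate_all`, and `fake_synthesize` are hypothetical stand-ins, and the real speech engine is replaced by a stub.

```python
def generate_all(shots, role_to_voice, synthesize):
    """Generate audio for every shot that has narration text.

    shots: list of dicts with "id", "role", and "text" keys (assumed shape).
    synthesize: callable (text, voice) -> audio; stands in for the real engine.
    """
    # Shots without narration text are skipped, as in step 3 above.
    with_text = [s for s in shots if s["text"].strip()]
    # Step 1 above: every role that speaks must have a voice before generating.
    unassigned = sorted({s["role"] for s in with_text if s["role"] not in role_to_voice})
    if unassigned:
        raise ValueError("No voice assigned to: " + ", ".join(unassigned))
    return {s["id"]: synthesize(s["text"], role_to_voice[s["role"]]) for s in with_text}

def fake_synthesize(text, voice):
    # Stub: returns a label instead of real audio data.
    return f"<{voice}: {len(text)} chars>"

shots = [
    {"id": 1, "role": "Narrator", "text": "The storm arrived at dusk."},
    {"id": 2, "role": "Mira", "text": "Close the shutters!"},
    {"id": 3, "role": "Narrator", "text": ""},
]
voices = {"Narrator": "voice_a", "Mira": "voice_b"}

audio = generate_all(shots, voices, fake_synthesize)
print(sorted(audio))  # [1, 2]
```

Failing early when a role has no voice mirrors the advice to finish voice assignment before clicking the button.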

8. Upload Custom Audio

Have your own recorded audio? Upload it directly and use timeline tools to split it across shots.

8.1 Upload Workflow

  1. Click the Upload Audio button
  2. Select your audio file (supports common formats: MP3, WAV, M4A)
  3. Audio is automatically added to the timeline
  4. Use timeline markers to split audio into different shots

Use cases:

  • Professional voice recordings
  • Pre-recorded narration
  • Audio from external sources

9. Timeline Controls and Audio Splitting

The timeline is the core feature of the Dubbing Page, providing precise control over audio playback and splitting.

9.1 Timeline Features

Playback Controls:

  • Play/Pause: Control audio playback
  • Jump buttons: Quickly navigate to previous or next shot
  • Playback speed: Adjust speed (0.5x - 2x)
  • Volume control: Adjust playback volume

Visual Timeline:

  • Shows audio segments for all shots
  • Displays waveforms for visual reference
  • Indicates shot boundaries

9.2 Audio Splitting Feature

Split a complete audio file into individual shots using timeline markers:

  1. Drag markers on the timeline to appropriate positions
  2. Markers automatically align to shot boundaries
  3. After adjusting, click the Apply Split button
  4. The system cuts the audio at the marker positions and assigns each segment to its shot

Pro tip: Use the waveform visualization to identify natural break points in your audio.
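
The splitting steps above amount to turning marker positions into consecutive time ranges. Here is a minimal sketch of that arithmetic. The function names are hypothetical, and the snapping rule (move a marker to the nearest shot boundary when it is within `tolerance` seconds) is an assumption, since the exact alignment behavior is not documented here.

```python
def snap_markers(markers, boundaries, tolerance=0.25):
    """Snap each marker to the nearest shot boundary if it lies within tolerance seconds."""
    if not boundaries:
        return list(markers)
    snapped = []
    for marker in markers:
        nearest = min(boundaries, key=lambda b: abs(b - marker))
        snapped.append(nearest if abs(nearest - marker) <= tolerance else marker)
    return snapped

def split_by_markers(total_duration, markers):
    """Return (start, end) segments, in seconds, for audio cut at each marker."""
    # Ignore duplicates and markers at or beyond the ends of the audio.
    cuts = sorted(m for m in set(markers) if 0.0 < m < total_duration)
    edges = [0.0] + cuts + [total_duration]
    return list(zip(edges[:-1], edges[1:]))

# A 12-second recording with two roughly placed markers and three shots.
shot_boundaries = [4.0, 8.0, 12.0]
markers = snap_markers([3.9, 8.2], shot_boundaries)
segments = split_by_markers(12.0, markers)
print(markers)   # [4.0, 8.0]
print(segments)  # [(0.0, 4.0), (4.0, 8.0), (8.0, 12.0)]
```

Two markers produce three segments, one per shot.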

10. Adjust Voice Parameters

Fine-tune voice generation with the stability parameter to match your content style.

[Image: Voice Parameters]

10.1 Stability Parameter

Click the stability button above the timeline to adjust voice stability:

Stability Levels:

  • High stability (1.0): More consistent and steady delivery
    • Best for: Professional narration, news, educational content
  • Medium stability (0.5): Balanced delivery
    • Best for: Storytelling, general videos
  • Low stability (0.0): More dynamic and expressive
    • Best for: Dramatic dialogue, emotional moments

10.2 When to Adjust Stability

Increase stability when:

  • Voice sounds too variable or inconsistent
  • You need professional, steady narration
  • You are producing educational or instructional content

Decrease stability when:

  • Voice sounds too robotic or stiff
  • You want more emotional expression
  • You are creating dramatic or theatrical content
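
The guidance above can be summarized as a starting value per content type plus a small nudge in one direction or the other. The sketch below encodes that idea. The preset names, the feedback labels, and the step size of 0.25 are illustrative assumptions; only the 0.0 to 1.0 range and the three reference levels come from this tutorial.

```python
# Starting points taken from the three stability levels listed in section 10.1.
STABILITY_PRESETS = {
    "narration": 1.0,     # professional narration, news, educational content
    "storytelling": 0.5,  # storytelling, general videos
    "dramatic": 0.0,      # dramatic dialogue, emotional moments
}

def clamp_stability(value):
    """Keep a stability value inside the 0.0 to 1.0 range."""
    return max(0.0, min(1.0, value))

def adjust_stability(current, feedback, step=0.25):
    """Nudge stability based on how the generated voice sounded.

    feedback: "too_variable" raises stability, "too_robotic" lowers it,
    anything else leaves it unchanged. The step size is an assumption.
    """
    if feedback == "too_variable":
        return clamp_stability(current + step)
    if feedback == "too_robotic":
        return clamp_stability(current - step)
    return current

value = STABILITY_PRESETS["storytelling"]
value = adjust_stability(value, "too_variable")
print(value)  # 0.75
```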

11. Preview and Export

After generating audio, preview it with your video to ensure quality and timing.

[Image: Preview Audio]

11.1 Preview Workflow

  1. Click a shot in the narration list
  2. Use timeline playback controls to listen to audio
  3. Observe audio waveforms and shot segments on the timeline
  4. Adjust voice or regenerate as needed

11.2 Export Tips

Quality Checklist:

  • All shots have generated audio (no missing audio)
  • No audio glitches or unnatural pauses
  • Voice matches character personality
  • Audio levels are consistent
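
Two of these checks, missing audio and inconsistent levels, lend themselves to a mechanical test if you have a loudness measurement for each shot. The following sketch assumes such measurements exist as a plain dictionary; the `check_audio` function, the dB values, and the 3 dB tolerance are hypothetical, and the remaining checklist items still need a listening pass.

```python
from statistics import median

def check_audio(levels_db, tolerance_db=3.0):
    """Flag shots with no audio and shots whose level stands out.

    levels_db maps shot id -> measured loudness in dB, or None if the shot
    has no audio. Returns (missing, outliers), both sorted lists of shot ids.
    An outlier differs from the median level by more than tolerance_db.
    """
    missing = sorted(sid for sid, level in levels_db.items() if level is None)
    present = {sid: level for sid, level in levels_db.items() if level is not None}
    if not present:
        return missing, []
    mid = median(present.values())
    outliers = sorted(sid for sid, level in present.items() if abs(level - mid) > tolerance_db)
    return missing, outliers

levels = {1: -16.0, 2: -15.5, 3: None, 4: -9.0, 5: -16.5}
missing, outliers = check_audio(levels)
print(missing)   # [3]
print(outliers)  # [4]
```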

Next Steps:

  • Preview the full video before final export
  • Return to the Editor page for final video editing
  • Export your completed video with professional audio

12. Common Workflow Example

Here’s a typical dubbing workflow from start to finish:

12.1 Complete Dubbing Workflow

  1. Prepare script: Write narration text for each shot in the Editor
  2. Enter Dubbing Page: Access from the project navigation menu
  3. Assign character voices: Select appropriate voices for narrator and characters
  4. Generate audio: Click “Multi-Character Voice” to generate for all shots
  5. Preview and adjust: Review audio and make adjustments as needed
  6. Adjust stability: Fine-tune parameters for specific shots if needed
  7. Apply lip sync: Generate lip sync for applicable shots (optional)
  8. Final check: Play through entire timeline to verify quality
  9. Return to Editor: Go back for final video editing
  10. Export video: Render final video with professional audio

13. Tips for Better Voice Quality

13.1 Voice Selection

  • Choose unique voices: Select distinct voices for different characters to avoid confusion
  • Maintain consistency: Keep the same voice for each character throughout
  • Match personality: Choose voices that fit character age, personality, and role

13.2 Technical Quality

  • Adjust stability: Fine-tune based on content type
  • Preview often: Listen frequently during editing to catch issues early
  • Use timeline visualization: Check that each audio segment lines up with its shot and has a plausible length

13.3 Workflow Efficiency

  • Assign all voices first: Complete voice assignment before generating
  • Batch generate: Use multi-character generation for efficiency
  • Save favorites: Star frequently used voices for quick access

14. Troubleshooting FAQ

14.1 Voice Sounds Unstable or Inconsistent

Solution:

  • Raise the Stability parameter toward 1.0
  • If the result then sounds too stiff, try a different voice with more natural variation
  • Check whether the script contains unusual formatting or characters

14.2 Audio Generation Failed

Solution:

  • Verify the shot has narration text
  • Ensure voices are assigned to all roles
  • Try regenerating audio for that specific shot
  • Check your internet connection

14.3 Characters Sound Too Similar

Solution:

  • Assign more distinct voices to each role
  • Choose voices with clear differences in gender, age, and style
  • Use the voice library filters to find contrasting voices

14.4 Lip Sync Results Not Ideal

Solution:

  • Ensure the shot shows a single front-facing person with a clearly visible face
  • Verify that the audio quality is good
  • Try adjusting the audio and regenerating lip sync
  • Check that the shot meets the lip sync requirements (see section 15.3)

15. Lip Sync Feature

Lip sync matches a character’s mouth movements to the audio so that dialogue looks natural and realistic.

[Image: Lip Sync]

15.1 What is Lip Sync?

Lip Sync is an AI technology that automatically adjusts mouth movements in your video to precisely match audio. This makes it look like the person is actually speaking the words.

15.2 Best Use Cases

Lip sync works best in these scenarios:

  • Single front-facing person: Shot contains only one person facing the camera
  • Clear facial features: Person’s face is clearly visible without obstructions
  • Dialogue scenes: Interviews, monologues, educational videos

15.3 Limitations

Important constraints:

  • Best for single front-facing person shots (multiple people or side profiles may not work well)
  • Requires both video and audio in the shot
  • Face must be clearly visible without obstructions
  • Works best with human faces (not animated characters)
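
These constraints can be read as a pre-flight checklist for a shot. The sketch below expresses them as a function that lists the reasons a shot may be a poor candidate. The `ShotInfo` fields and the `lip_sync_issues` function are hypothetical; in practice you would judge these properties by looking at the shot, since this tutorial does not describe any automatic detection.

```python
from dataclasses import dataclass

# Hypothetical description of a shot, filled in by hand for illustration.
@dataclass
class ShotInfo:
    has_video: bool
    has_audio: bool
    face_count: int
    front_facing: bool
    face_unobstructed: bool
    animated_character: bool = False

def lip_sync_issues(shot):
    """Return a list of reasons the shot may not suit lip sync (empty if none)."""
    issues = []
    if not shot.has_video:
        issues.append("shot has no video")
    if not shot.has_audio:
        issues.append("shot has no audio")
    if shot.face_count != 1:
        issues.append("shot should contain exactly one person")
    if not shot.front_facing:
        issues.append("person should face the camera")
    if not shot.face_unobstructed:
        issues.append("face should be clearly visible")
    if shot.animated_character:
        issues.append("animated characters may give weaker results")
    return issues

interview = ShotInfo(True, True, 1, True, True)
crowd = ShotInfo(True, False, 3, False, True)
print(lip_sync_issues(interview))  # []
print(len(lip_sync_issues(crowd)))  # 3
```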

15.4 How to Use Lip Sync

  1. In the Dubbing Page, select a shot with both video and audio
  2. Click the Lip Sync tab in the right panel
  3. Click the Match lip sync to video button
  4. Wait for AI processing (usually takes a few minutes)
  5. Preview the lip-synced video when processing completes
  6. Apply to final video if satisfied

15.5 Lip Sync Workflow

Typical workflow:

  1. Generate audio first: Use dubbing features to generate audio for the shot
  2. Check shot suitability: Ensure the shot shows a single front-facing person with a clearly visible face
  3. Apply lip sync: Click Match lip sync to video in the Lip Sync tab
  4. Preview results: Review the lip-synced effect
  5. Fine-tune if needed: Adjust audio and regenerate if results aren’t ideal
  6. Export video: Export final video when satisfied

15.6 Tips and Tricks

  • Complete audio first: Ensure audio is generated and satisfactory before applying lip sync
  • Choose suitable shots: Not all shots are suitable; select clear front-facing shots for best results
  • Be patient: Lip sync requires AI processing and may take several minutes
  • Try multiple times: If the first attempt isn’t ideal, adjust the audio and regenerate

16. Summary and Next Steps

With these tools, you can take narration from script to final video.

Key Takeaways:

  • Use Editor for quick voice generation
  • Use Dubbing Page for advanced multi-character dialogue
  • Assign voices to roles for consistency
  • Adjust the stability parameter to match content style
  • Use lip sync for realistic dialogue scenes

Build Your Voice Library:

  • Save favorite voices for future projects
  • Document successful voice and parameter combinations
  • Create reusable templates for consistent results

Ready to create? Start experimenting with voice generation and discover how professional narration can transform your videos!