notebook-nexus/chatterbox-tts-colab

πŸŽ™οΈ Chatterbox TTS Colab - Easy Voice Cloning & Text-to-Speech

Open In Colab | License: MIT | Status: Active | Python 3.8+ | GitHub stars

🚀 One-click voice cloning and text-to-speech in Google Colab with Chatterbox TTS

Transform any text into natural-sounding speech, clone voices from audio samples, and create professional voiceovers - all running free in Google Colab!

🚀 Quick Start

  1. Click the "Open in Colab" button above
  2. Run all cells in the notebook
  3. Upload your voice sample (optional)
  4. Enter your text and generate speech!
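
For readers curious about what steps 2 to 4 amount to in code, here is a minimal sketch. It assumes the upstream `chatterbox-tts` package and its `ChatterboxTTS.from_pretrained` and `generate` calls; the helper names `load_model` and `synthesize` are hypothetical and are not taken from the notebook itself.

```python
from typing import Any, Optional


def load_model(device: str = "cuda") -> Any:
    """Load Chatterbox TTS (hypothetical helper, not the notebook's own code)."""
    # Imported lazily so this file can be read and tested without the package.
    from chatterbox.tts import ChatterboxTTS

    return ChatterboxTTS.from_pretrained(device=device)


def synthesize(model: Any, text: str, voice_sample: Optional[str] = None) -> Any:
    """Generate speech, cloning from voice_sample only if one was uploaded."""
    if not text.strip():
        raise ValueError("text must not be empty")
    kwargs = {}
    if voice_sample:  # step 3 is optional, so only pass a prompt when present
        kwargs["audio_prompt_path"] = voice_sample
    return model.generate(text, **kwargs)
```

In a Colab cell this might be used as `model = load_model()` followed by `wav = synthesize(model, "Hello!", "my_voice.wav")`, and the result saved with `torchaudio.save("output.wav", wav, model.sr)`.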

✨ Features

  • 🎯 Zero Setup: Run immediately in Google Colab
  • πŸ—£οΈ Voice Cloning: Clone any voice from a short audio sample
  • πŸŽ›οΈ Advanced Controls: Fine-tune voice characteristics
  • πŸ’Ύ Google Drive Integration: Automatic saving to your drive
  • πŸ”§ Robust Error Handling: Graceful fallbacks and clear error messages
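
As an illustration of the Drive integration and fallback behaviour listed above, the sketch below picks an output folder on Google Drive when it is mounted and falls back to a local folder otherwise. `/content/drive/MyDrive` is the usual Colab mount point; the function name, the `chatterbox_outputs` subfolder, and the `outputs` fallback are assumptions made for this example, not the notebook's actual paths.

```python
from pathlib import Path


def resolve_output_dir(
    drive_root: str = "/content/drive/MyDrive",
    subdir: str = "chatterbox_outputs",  # hypothetical folder name
    fallback: str = "outputs",           # hypothetical local fallback
) -> Path:
    """Return a writable output folder, preferring Google Drive when mounted."""
    root = Path(drive_root)
    # If Drive is not mounted, the mount point does not exist: fall back locally.
    target = root / subdir if root.is_dir() else Path(fallback)
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Generated audio can then be written to `resolve_output_dir() / "speech.wav"` without caring whether Drive was mounted.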

🔊 Demo: Text & Audio Samples

Here's a quick demo so you can see and hear how Chatterbox-TTS-Colab performs.


πŸ“ Sample Text

β€œThis is a test of the Chatterbox TTS system. I hope this works properly now with the improved error handling and correct repository. The model should now load from ResembleAI/chatterbox instead of the old fluffyox repository.”


🎤 Original Voice Clip (for cloning)

cloned_voice.mov

🤖 AI-Generated TTS Output

generated_voice.mov

πŸŽ›οΈ Advanced Controls

Parameter Guide

| Parameter | Range | Description | Recommended Use |
|-----------|-------|-------------|-----------------|
| `exaggeration` | 0.0-1.0 | Controls emotional intensity and expressiveness | 0.5 for natural speech, 0.7+ for dramatic |
| `cfg` | 0.0-1.0 | Classifier-free guidance for speech pacing | 0.5 for normal, 0.3 for slower pacing |
| `temperature` | 0.1-2.0 | Controls randomness in generation | 0.7 for balanced, 1.0+ for more variation |
| `top_p` | 0.1-1.0 | Nucleus sampling parameter | 0.9 for most cases |
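
To keep user-supplied values inside the ranges from the parameter guide, a small helper along these lines could be used. The ranges and defaults are copied from the guide; the helper itself (`clamp_params`) is a hypothetical addition and not part of Chatterbox or of this notebook.

```python
# Ranges and recommended defaults as listed in the parameter guide.
PARAM_RANGES = {
    "exaggeration": (0.0, 1.0),
    "cfg": (0.0, 1.0),
    "temperature": (0.1, 2.0),
    "top_p": (0.1, 1.0),
}
DEFAULTS = {"exaggeration": 0.5, "cfg": 0.5, "temperature": 0.7, "top_p": 0.9}


def clamp_params(**overrides: float) -> dict:
    """Merge overrides into the defaults and clamp each value to its range."""
    unknown = set(overrides) - set(PARAM_RANGES)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    merged = {**DEFAULTS, **overrides}
    clamped = {}
    for name, value in merged.items():
        low, high = PARAM_RANGES[name]
        clamped[name] = min(max(float(value), low), high)
    return clamped
```

For example, `clamp_params(temperature=5.0)` keeps the other defaults and pulls `temperature` back to the documented maximum of 2.0.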

Audio Quality Settings

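```python
# Assumes `model`, `text`, and `AUDIO_PROMPT_PATH` were defined by earlier notebook cells.

# High quality (slower generation)
wav = model.generate(
    text,
    audio_prompt_path=AUDIO_PROMPT_PATH,  # voice sample to clone
    exaggeration=0.5,  # emotional intensity
    cfg=0.5,           # named cfg_weight in upstream resemble-ai/chatterbox; use whichever your installed version accepts
    temperature=0.7,   # sampling randomness
    top_p=0.9,         # nucleus sampling cutoff
    steps=30,          # more steps = slower, higher quality
)

# Fast generation (lower quality)
wav = model.generate(
    text,
    audio_prompt_path=AUDIO_PROMPT_PATH,
    steps=15,  # fewer steps = faster generation
)
```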
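
The two settings above can be wrapped as named presets. The sketch below is one possible way to do that, not the notebook's implementation: the preset values are copied from the snippets above, while `QUALITY_PRESETS`, `split_supported`, and `generate_with_preset` are hypothetical names. Because the keyword arguments accepted by `generate` may differ between Chatterbox versions, the helper checks the installed signature with `inspect.signature` and reports any argument it has to drop.

```python
import inspect
from typing import Any, Callable, Dict, Tuple

# Values copied from the high-quality and fast snippets above.
QUALITY_PRESETS: Dict[str, Dict[str, Any]] = {
    "high": {"exaggeration": 0.5, "cfg": 0.5, "temperature": 0.7,
             "top_p": 0.9, "steps": 30},
    "fast": {"steps": 15},
}


def split_supported(
    func: Callable[..., Any], kwargs: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split kwargs into (accepted, dropped) according to func's signature."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the function accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs), {}
    accepted = {k: v for k, v in kwargs.items() if k in params}
    dropped = {k: v for k, v in kwargs.items() if k not in params}
    return accepted, dropped


def generate_with_preset(model: Any, text: str, quality: str = "high",
                         **overrides: Any) -> Any:
    """Call model.generate with a preset, skipping arguments it does not accept."""
    if quality not in QUALITY_PRESETS:
        raise ValueError(f"unknown preset: {quality!r}")
    wanted = {**QUALITY_PRESETS[quality], **overrides}
    accepted, dropped = split_supported(model.generate, wanted)
    if dropped:
        print(f"Ignoring unsupported arguments: {sorted(dropped)}")
    return model.generate(text, **accepted)
```

A call such as `generate_with_preset(model, text, "fast", audio_prompt_path=AUDIO_PROMPT_PATH)` then behaves like the fast snippet, and prints a notice instead of failing if the installed version does not accept one of the arguments.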

For more detailed documentation, see USAGE.md.


🤝 Contributing

Please see our Contributing Guide for details.


πŸ™ Acknowledgments

  • Resemble AI for creating the incredible Chatterbox TTS model
  • Google Colab for providing free GPU access
  • Hugging Face for model hosting and distribution
  • PyTorch and Torchaudio for the underlying framework
  • The Open Source Community for continuous support and contributions

Special Thanks

  • Original Chatterbox TTS: resemble-ai/chatterbox
  • Resemble AI Team for open-sourcing this state-of-the-art model
  • Contributors who help maintain and improve this Colab implementation

📞 Support


🔗 Connect

📝 Writing & Blogging

Hashnode Medium

💼 Professional

Website ukr-projects cyberx-projects contro-projects LinkedIn Main Channel

🌐 Social

Twitter Instagram Tech Channel Telegram Reddit


Made with ❤️ by ukr