🎤 Voice Cloning Model Evaluation

✨ Overview

This repository provides a comprehensive evaluation of voice cloning models based on objective speech quality metrics. Our goal is to assess the effectiveness of these models in generating high-quality, intelligible, and natural-sounding voices.

📈 Models Evaluated

Model	PESQ	STOI	MCD	Pitch Corr	Spec Conv	Energy Ratio	SNR (dB)
OpenVoice	1.165	0.136	37.988	-0.027	3.475	12.305	-11.193
CoquiTTS	1.727	0.143	203.193	0.012	6.675	45.896	-16.717
F5-TTS	1.782	0.171	174.265	0.060	6.082	39.209	-16.065
E2-TTS	2.281	0.165	158.578	-0.051	5.760	34.939	-15.551

🔬 Evaluation Metrics

PESQ (Perceptual Evaluation of Speech Quality): Measures speech quality, with values ranging from -0.5 to 4.5 (higher is better).
STOI (Short-Time Objective Intelligibility): Assesses how well the synthesized voice is understood (range: 0 to 1, higher is better).
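MCD (Mel Cepstral Distortion): Measures the distance between the mel-cepstral coefficients of the synthesized and reference speech; lower values indicate more accurate voice cloning.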
Pitch Correlation: Measures how closely the pitch matches the original speaker (closer to 1 is better).
Spectral Convergence (Spec Conv): Evaluates how well spectral features align (lower is better).
Energy Ratio: Assesses energy distribution in frequency bands.
SNR (Signal-to-Noise Ratio in dB): Higher values indicate cleaner, more natural output.
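
To make three of the definitions above concrete, here is a minimal pure-Python sketch of MCD, pitch correlation, and SNR. It is not taken from voice_cloning.ipynb; the function names, the use of the difference signal as "noise" for SNR, and the assumption that both inputs are already time-aligned and pre-extracted (mel-cepstral frames, F0 contours, raw samples) are illustrative choices only.

```python
import math


def mel_cepstral_distortion(ref_frames, gen_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Assumption: the caller has already aligned the frames and dropped the
    0th (energy) coefficient if that is desired.
    """
    scale = 10.0 / math.log(10.0)
    per_frame = [
        scale * math.sqrt(2.0 * sum((r - g) ** 2 for r, g in zip(ref, gen)))
        for ref, gen in zip(ref_frames, gen_frames)
    ]
    return sum(per_frame) / len(per_frame)


def pitch_correlation(ref_f0, gen_f0):
    """Pearson correlation between two F0 contours (assumed pre-extracted)."""
    n = min(len(ref_f0), len(gen_f0))
    x, y = ref_f0[:n], gen_f0[:n]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a flat contour
    return cov / math.sqrt(var_x * var_y)


def snr_db(reference, generated):
    """SNR in dB, treating (reference - generated) as the noise signal."""
    n = min(len(reference), len(generated))
    signal_power = sum(r * r for r in reference[:n])
    noise_power = sum((r - g) ** 2 for r, g in zip(reference[:n], generated[:n]))
    if noise_power == 0:
        return math.inf
    if signal_power == 0:
        return -math.inf
    return 10.0 * math.log10(signal_power / noise_power)
```

Under this SNR definition, a negative value means the difference between the two signals carries more energy than the reference itself, which can happen when the waveforms are not sample-aligned.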

🌐 Test the Models

You can experiment with each model using the provided links:

OpenVoice: Try Here
TortoiseTTS: Try Here
E2-F5-TTS: Try Here
CoquiTTS: Try Here

🌁 Summary & Recommendations

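Best for natural voice quality: E2-TTS (highest PESQ at 2.281; its MCD of 158.578 is lower than CoquiTTS and F5-TTS but well above OpenVoice's 37.988).
Best for intelligibility: F5-TTS (highest STOI score, 0.171).
Moderate performance: CoquiTTS (balanced results, but the highest MCD and spectral convergence of the four).
Least recommended: OpenVoice (lowest PESQ and STOI, although it has the lowest MCD and spectral convergence and the highest SNR).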
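
The per-metric winners can be checked directly against the results table. The sketch below copies the table values and picks the best model for each metric whose direction is stated in the metrics section; Energy Ratio is left out because no "better" direction is given for it. The names `SCORES`, `HIGHER_IS_BETTER`, and `best_per_metric` are hypothetical and not part of the repository.

```python
# Scores copied from the "Models Evaluated" table.
SCORES = {
    "OpenVoice": {"PESQ": 1.165, "STOI": 0.136, "MCD": 37.988,
                  "Pitch Corr": -0.027, "Spec Conv": 3.475, "SNR (dB)": -11.193},
    "CoquiTTS": {"PESQ": 1.727, "STOI": 0.143, "MCD": 203.193,
                 "Pitch Corr": 0.012, "Spec Conv": 6.675, "SNR (dB)": -16.717},
    "F5-TTS": {"PESQ": 1.782, "STOI": 0.171, "MCD": 174.265,
               "Pitch Corr": 0.060, "Spec Conv": 6.082, "SNR (dB)": -16.065},
    "E2-TTS": {"PESQ": 2.281, "STOI": 0.165, "MCD": 158.578,
               "Pitch Corr": -0.051, "Spec Conv": 5.760, "SNR (dB)": -15.551},
}

# Direction of each metric as described in the metrics section.
# Pitch correlation is "closer to 1 is better"; since every value here is
# below 1, that is the same as "higher is better".
HIGHER_IS_BETTER = {
    "PESQ": True,
    "STOI": True,
    "MCD": False,
    "Pitch Corr": True,
    "Spec Conv": False,
    "SNR (dB)": True,
}


def best_per_metric(scores, higher_is_better):
    """Return {metric: best model name} for every metric with a known direction."""
    best = {}
    for metric, higher in higher_is_better.items():
        pick = max if higher else min
        best[metric] = pick(scores, key=lambda model: scores[model][metric])
    return best


if __name__ == "__main__":
    for metric, model in best_per_metric(SCORES, HIGHER_IS_BETTER).items():
        print(f"{metric}: {model}")
```

A per-metric view like this makes it easier to see where a single headline recommendation hides a trade-off between metrics.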

📚 License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.
