Resemble AI Releases Chatterbox Turbo for Open TTS
The new text-to-speech model focuses on performance and offers voice cloning capabilities for English under a permissive MIT license.
Resemble AI, a company specializing in synthetic voice technology, has released a new open-source model named Chatterbox Turbo. The model is designed for high-performance text-to-speech (TTS) generation in English, targeting developers who need fast and efficient voice output in their applications.
Beyond standard speech synthesis, Chatterbox Turbo includes voice cloning capabilities, allowing users to generate speech in a specific target voice from a short audio sample. The project is available on Hugging Face under the MIT license, which permits use and modification in both academic and commercial projects.
This release adds another contender to the rapidly growing field of open-source speech generation. While proprietary APIs have dominated the high-quality TTS space, permissively licensed models like Chatterbox Turbo give developers a self-hostable alternative. Coming from a commercial provider, the release fits a broader trend of companies contributing foundational models back to the community.
Developers interested in experimenting with the model can find the necessary code and instructions on the official Hugging Face repository.
Sources
- ResembleAI/chatterbox-turbo (Hugging Face)