Kyutai Releases 1.6B Bilingual TTS Model
The French AI lab's new open-source model generates streaming audio in English and French under a permissive license.
French AI research lab Kyutai has released a new 1.6-billion-parameter text-to-speech (TTS) model that generates high-quality audio in both English and French. The model is available under a permissive Creative Commons license (CC-BY-4.0), allowing for broad use, including in commercial applications.
The system is built on what the team calls the Moshi stack, the framework behind the lab's earlier Moshi speech model. A key feature is streaming audio generation, which makes the model suitable for real-time applications where low latency is critical, such as conversational agents or live narration tools.
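To make the latency argument concrete, here is a minimal toy sketch in Python. It does not use Kyutai's code or API: `synthesize_word`, `offline_tts`, `streaming_tts`, and `steps_until_first_chunk` are hypothetical names invented for illustration, and the "audio" is just placeholder bytes. It only shows why a streaming generator lets playback begin after a single model step, while a batch call makes the listener wait for the whole input.

```python
from typing import Iterable, Iterator, List


def synthesize_word(word: str) -> bytes:
    """Hypothetical stand-in for one model step; returns placeholder bytes, not audio."""
    return word.encode("utf-8")


def offline_tts(words: Iterable[str]) -> List[bytes]:
    """Batch mode: nothing is returned until every word has been processed."""
    return [synthesize_word(w) for w in words]


def streaming_tts(words: Iterable[str]) -> Iterator[bytes]:
    """Streaming mode: each chunk is yielded as soon as it is produced."""
    for w in words:
        yield synthesize_word(w)


def steps_until_first_chunk(words: List[str], streaming: bool) -> int:
    """Count how many input words are consumed before any audio is available."""
    steps = 0

    def counted(ws: List[str]) -> Iterator[str]:
        nonlocal steps
        for w in ws:
            steps += 1
            yield w

    if streaming:
        # Pull only the first chunk, as a player would before starting playback.
        next(streaming_tts(counted(words)), None)
    else:
        # The batch call consumes the full input before returning anything.
        offline_tts(counted(words))
    return steps


if __name__ == "__main__":
    sentence = "Bonjour and welcome to the demo".split()
    print("streaming:", steps_until_first_chunk(sentence, streaming=True))
    print("offline:", steps_until_first_chunk(sentence, streaming=False))
```

In this toy, the streaming path consumes one word before audio is available, while the offline path consumes all of them. Real systems measure this gap in milliseconds rather than steps, but the shape of the trade-off is the same.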
A New Voice in Open TTS
While the open-source landscape for large language models has exploded, the space for high-quality, permissively licensed TTS models is less crowded. Kyutai's release provides a significant new building block for developers creating voice-enabled products without relying on proprietary, closed-source APIs.
Because Kyutai is a non-profit lab, its release adds diversity to an ecosystem often dominated by a few large tech companies. Developers can access the model weights, explore an interactive demo, and find usage examples on the official Hugging Face repository.
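For readers who want to fetch the weights, the following is a minimal sketch that assumes the `huggingface_hub` Python package is installed. It uses the library's general-purpose `snapshot_download` function with the repository name listed in the sources; `download_model` is a hypothetical helper name, and the sketch only downloads files. Loading the model and running inference depend on Kyutai's own tooling, which is not shown here.

```python
from typing import Optional

# Repository name as listed in this article's sources.
REPO_ID = "kyutai/tts-1.6b-en_fr"


def download_model(local_dir: Optional[str] = None) -> str:
    """Download the repository files and return the local folder path.

    Hypothetical helper for illustration. The import is done lazily so this
    module can be loaded even when huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    # Requires network access; the download may be several gigabytes.
    path = download_model()
    print("Model files stored in:", path)
```

Since the model is released under CC-BY-4.0, any product built on these weights should include attribution to Kyutai.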
Sources
- kyutai/tts-1.6b-en_fr (Hugging Face)