Soprano TTS Model Leverages Qwen3 Architecture
The new 80-million-parameter text-to-speech model adapts a powerful language model architecture for efficient, open-source audio generation.
A new open-source model for generating speech from text has been released on the Hugging Face Hub under the ekwek account. Named Soprano-1.1-80M, the model is compact at just 80 million parameters and is available under the permissive Apache 2.0 license.
What sets Soprano apart is its foundation. The model adapts the architecture of Qwen3, a family of models primarily known for large-scale text generation. Reusing a modern large language model (LLM) architecture for the specialized task of text-to-speech (TTS) is an increasingly common strategy, since it lets speech models build on designs that were developed and refined for text.
The model's small size is a significant advantage, making it suitable for developers who need to run speech synthesis on consumer-grade hardware or in resource-constrained environments. By combining this efficiency with an open license, Soprano lowers the barrier for integrating custom, high-quality voice generation into a wide range of applications, from accessibility tools to creative projects.
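To put the 80-million-parameter figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at a few common numeric precisions. It assumes memory is simply parameter count times bytes per parameter, and it ignores activations, caches, and any separate audio decoder. The precision the released checkpoint actually ships in is not stated here, so the entries below are illustrative rather than a description of the published files.

```python
# Rough weight-memory estimate for an 80-million-parameter model.
# Assumption: memory ~= parameter count x bytes per parameter; activations,
# caches, and any separate audio decoder are ignored.

PARAM_COUNT = 80_000_000  # taken from the model name, Soprano-1.1-80M

# Bytes per parameter for common storage precisions (illustrative choices).
BYTES_PER_PARAM = {
    "float32": 4,
    "float16": 2,
    "int8": 1,
}


def weight_megabytes(param_count: int, bytes_per_param: int) -> float:
    """Return the approximate weight size in megabytes (1 MB = 10**6 bytes)."""
    return param_count * bytes_per_param / 1_000_000


footprints = {
    name: weight_megabytes(PARAM_COUNT, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, megabytes in footprints.items():
    print(f"{name}: ~{megabytes:.0f} MB")
```

Under these assumptions the weights come to roughly 320 MB in 32-bit floats and 160 MB in 16-bit floats, which is why a model of this size can plausibly fit on consumer-grade hardware.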
Soprano-1.1-80M is available for download and experimentation now from the Hugging Face Hub.
Sources
- ekwek/Soprano-1.1-80M (Hugging Face)