Zyphra Releases Open-Source Zonos 2 TTS Model

The new text-to-speech model offers a commercially permissive alternative for developers in a field still dominated by closed-source APIs.

Jun 11, 2026

AI research company Zyphra has released Zonos 2, an open-weight model for text-to-speech (TTS) synthesis. The model is designed to generate human-like audio from text inputs, providing a foundational tool for applications requiring voice output.

The most significant aspect of the release is its license. Zonos 2 is available under Apache 2.0, a permissive open-source license that allows commercial use, modification, and distribution. Many high-quality TTS systems, by contrast, are accessible only through proprietary, paid APIs, so the release gives developers a new option for building and owning their voice generation stack.

While Zyphra has not yet published detailed technical specifications or performance benchmarks, the model and its weights are available for download on Hugging Face. This allows developers and researchers to immediately begin experimenting with the model and integrating it into their projects.

Zonos 2 is a welcome expansion of open-source AI into modalities beyond text generation. As developers build more complex, multi-modal applications, high-quality, permissively licensed components for audio, vision, and other modalities will become increasingly important.

Sources

Zyphra/ZONOS2 (Hugging Face)
