Kani-TTS-2 Offers New Open-Source Voice Generation

An independent researcher has released an English text-to-speech model under the permissive Apache 2.0 license, built on an LFM2 backbone.

Feb 12, 2026

The independent researcher known as 'nineninesix' has released Kani-TTS-2, a permissively licensed text-to-speech (TTS) model. The English-language model is available for download on the Hugging Face Hub as nineninesix/kani-tts-2-en, giving developers another open-source option for voice generation.

Technically, Kani-TTS-2 is built on an LFM2 backbone, the second generation of Liquid AI's compact Liquid Foundation Models. The release did not include performance benchmarks, so its quality and efficiency have yet to be measured independently.

Why It Matters

The release of Kani-TTS-2 adds to a growing ecosystem of open audio models. Under the Apache 2.0 license, developers can use, modify, and redistribute the model in both research and commercial applications, provided they retain the license and attribution notices. This contrasts with the many proprietary, API-gated TTS services and gives teams more flexibility and control when building voice-enabled products.

You can explore the model, listen to samples, and find implementation details in the official Kani-TTS-2 repository on Hugging Face.
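For readers who want the files locally, here is a minimal sketch of fetching the repository with the `huggingface_hub` package. It assumes the package is installed and that the repository is public under the name listed in Sources. The helper names `repo_url` and `download_model` are illustrative and not part of the release, and the sketch makes no assumption about how inference is run; the model card is the place to check for that.

```python
from pathlib import Path

# Repository name as listed under Sources.
REPO_ID = "nineninesix/kani-tts-2-en"


def repo_url(repo_id: str = REPO_ID) -> str:
    """Return the Hugging Face Hub page for a model repository."""
    return f"https://huggingface.co/{repo_id}"


def download_model(repo_id: str = REPO_ID) -> Path:
    """Download the repository's files into the local Hub cache.

    Requires `pip install huggingface_hub` and network access.
    Returns the local directory that holds the downloaded snapshot.
    """
    # Imported lazily so repo_url() works without the package installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id))
```

Calling `download_model()` returns the cached snapshot directory, which can then be inspected for the weights, configuration, and any usage instructions the author shipped.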

Sources

nineninesix/kani-tts-2-en (Hugging Face)
