nineninesix / Text → Speech

Kani TTS 370M Offers Compact Multilingual Speech

Built on Liquid AI's LFM2 (Liquid Foundation Model) architecture, the new model offers developers an efficient option for speech synthesis.

Sep 30, 2025


A new, efficient text-to-speech model called Kani TTS 370M has been released on the Hugging Face Hub. Developed by the user nineninesix, the model contains 370 million parameters, offering a relatively lightweight option for generating high-quality, multilingual speech.

The model is built on LFM2, the second-generation Liquid Foundation Model architecture from Liquid AI. It is described as handling multiple languages without relying on explicit language identification tags during training or inference, learning instead to synthesize each language from a mixed dataset. That design choice can make a model more flexible and easier to scale across diverse linguistic applications.

Kani TTS 370M's compact size is its most notable feature. In a field often dominated by multi-billion-parameter models, a smaller footprint makes it more accessible to researchers and developers with limited computational resources. This could enable on-device use or lower-cost cloud deployments where efficiency is the primary concern.
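To put the size in perspective, here is a back-of-the-envelope sketch of how much memory the weights alone would occupy at a few common numeric precisions. Only the 370 million parameter count comes from the article. The precision options and the helper names `weight_memory_mb` and `footprint_table` are illustrative assumptions, and the estimate ignores activations, caches, and any separate audio codec.

```python
# Rough weight-only memory estimate. Not a measurement of the real model.
PARAM_COUNT = 370_000_000  # parameter count stated in the article

# Assumed storage formats; the article does not say which one the release uses.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_mb(param_count: int, bytes_per_param: int) -> float:
    """Return weight storage in megabytes (1 MB = 10**6 bytes)."""
    return param_count * bytes_per_param / 1e6


def footprint_table(param_count: int) -> dict[str, float]:
    """Map each assumed precision to its weight-only footprint in MB."""
    return {
        name: weight_memory_mb(param_count, size)
        for name, size in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for name, mb in footprint_table(PARAM_COUNT).items():
        print(f"{name:>8}: {mb:,.0f} MB")
```

By the same arithmetic, a hypothetical 7-billion-parameter model would need about 14,000 MB for its weights at 16-bit precision, roughly 19 times as much.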

The model weights and usage instructions are publicly available on its Hugging Face repository. While the weights are accessible, the license is listed as "All rights reserved," indicating that it is not intended for commercial use without permission from the creator.

Sources

nineninesix/kani-tts-370m
Hugging Face
