Veena TTS Model Targets Indian Languages with Llama Base
Maya Research has released a 3-billion-parameter model designed to generate natural-sounding speech in Hindi and English.
Maya Research has introduced Veena, a new open-source model for text-to-speech (TTS) synthesis. The model is specifically designed to address the need for high-quality voice generation in major Indian languages.
Veena is built on a Llama-style architecture. The 3-billion-parameter model is trained to generate natural-sounding speech from text in both Hindi and English, with attention to the nuances of Indian accents and dialects. The design applies the text-modeling strengths of large language models to the separate task of audio generation.
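The article does not describe how Veena represents audio internally. As a general illustration of how a language model can be pointed at audio generation, one common approach places discrete audio codec codes after the text vocabulary, so the model predicts audio as ordinary tokens. The sketch below shows that mapping; every size and function name in it is hypothetical and is not taken from Veena's configuration.

```python
# Illustrative only: all sizes and the layout are hypothetical,
# not Veena's actual configuration.
TEXT_VOCAB_SIZE = 32_000  # hypothetical text vocabulary size
CODEBOOK_SIZE = 1_024     # hypothetical number of entries per audio codebook
NUM_CODEBOOKS = 4         # hypothetical number of codebooks per audio frame


def audio_code_to_token(codebook: int, code: int) -> int:
    """Map (codebook index, code) to a token ID placed after the text vocabulary."""
    if not 0 <= codebook < NUM_CODEBOOKS:
        raise ValueError("codebook out of range")
    if not 0 <= code < CODEBOOK_SIZE:
        raise ValueError("code out of range")
    return TEXT_VOCAB_SIZE + codebook * CODEBOOK_SIZE + code


def token_to_audio_code(token_id: int) -> tuple[int, int]:
    """Invert audio_code_to_token; raises if token_id is not an audio token."""
    offset = token_id - TEXT_VOCAB_SIZE
    if not 0 <= offset < NUM_CODEBOOKS * CODEBOOK_SIZE:
        raise ValueError("not an audio token")
    return divmod(offset, CODEBOOK_SIZE)


def flatten_frames(frames: list[list[int]]) -> list[int]:
    """Flatten per-frame codes into one token sequence a language model could predict."""
    return [
        audio_code_to_token(codebook, code)
        for frame in frames
        for codebook, code in enumerate(frame)
    ]
```

With a layout like this, a single sequence can hold text tokens followed by audio tokens, and a separate codec decoder would turn the predicted codes back into a waveform.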
The release is a notable step for open-source AI in a region where high-quality, openly available speech models have been less common than for English-only markets. By focusing on widely spoken Indian languages, Veena could enable a range of applications, from localized voice assistants and accessibility tools to automated content creation for one of the world's largest digital audiences.
The model weights and usage instructions are available for download on the Hugging Face Hub. The model is released under a custom license, and potential users should review the terms before using it.
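For readers who want to fetch the files, a minimal sketch using the `huggingface_hub` library might look like the following. It assumes the library is installed and that any access terms on the repository have been accepted; the helper name `download_veena` and the default directory are hypothetical, while the repository name is the one listed in the sources.

```python
REPO_ID = "maya-research/Veena"  # repository name as listed in the article's sources


def download_veena(local_dir: str = "veena") -> str:
    """Download a snapshot of the repository and return the local path.

    Hypothetical helper: assumes huggingface_hub is installed and that the
    repository is accessible to the current user.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example (requires network access):
#     path = download_veena()
```

The snapshot includes whatever license and usage files the publisher ships, which is a convenient place to check the terms mentioned above.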
Sources
- maya-research/Veena (Hugging Face)