The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

Alibaba Releases CosyVoice 3 for Expressive TTS

The new 500-million-parameter text-to-speech model from Alibaba's FunAudioLLM team offers multilingual voice cloning and emotional control.

Dec 11, 2025
Qwen · Alibaba · Text → Speech
Fun-CosyVoice3-0.5B

Alibaba’s FunAudioLLM team, part of the group behind the Qwen model family, has released Fun-CosyVoice3, a 500-million-parameter foundation model for text-to-speech (TTS). The model is designed to generate natural, expressive, and controllable speech.

CosyVoice 3 stands out for a feature set that brings it closer to the capabilities of leading proprietary services, giving developers an open option for building voice applications.

Cloning, Control, and Multilingual Support

The model's core strengths lie in its versatility and fine-grained control. Key features highlighted in the official release include:

  • Multilingual and Accent Support: CosyVoice 3 handles over ten languages, including English, Chinese, Japanese, French, and Spanish, and can manage code-switching between them.
  • Zero-Shot Voice Cloning: It can replicate a speaker’s voice from a mere 3-second audio clip, even performing cross-lingual cloning where the target language differs from the source clip.
  • Expressive Control: The model allows for adjustments to emotion, style, rhythm, and prosody, enabling the generation of nuanced and context-aware speech.

While the model is available for commercial use, it is released under the tongyi-qianwen-license-1.0, which carries restrictions. Companies with more than 100 million monthly active users must seek a separate license from Alibaba, a detail developers should note before integrating it into large-scale products.
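
The threshold described above can be expressed as a simple check for teams that track usage. This is a minimal sketch based only on the "more than 100 million monthly active users" wording in this article; the constant and function names are hypothetical, and how monthly active users are counted is an assumption that the license text itself would need to settle.

```python
# Threshold as reported in this article; the license text is the authority.
LICENSE_MAU_THRESHOLD = 100_000_000


def needs_separate_license(monthly_active_users: int) -> bool:
    """Hypothetical helper: True if usage exceeds the reported threshold."""
    if monthly_active_users < 0:
        raise ValueError("monthly_active_users must be non-negative")
    # "More than 100 million" is read as a strict comparison.
    return monthly_active_users > LICENSE_MAU_THRESHOLD
```

Under this reading, a product with exactly 100 million monthly active users would not yet cross the threshold, while one more user would.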

Sources

  • FunAudioLLM/Fun-CosyVoice3-0.5B-2512 (Hugging Face)

Specs

Parameters: 500M
Context window: not listed
License: Other
Downloads: 39.2K

Modalities

Text → Speech
