The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

Qwen Releases Open-Source Voice Cloning Model

The new 600-million-parameter Qwen3-TTS model can generate speech in multiple languages and clone voices from short audio clips.

Jan 21, 2026
Notable · Apache 2.0
Qwen · Alibaba · Text → Speech
Qwen3-TTS 0.6B Base

The Qwen team, part of Alibaba, has released Qwen3-TTS, a new open-source model for generating human-like speech. This initial release is a 600-million-parameter base model designed for text-to-speech (TTS) applications.

The model's key capabilities are multilingual support and voice cloning: it can generate speech in multiple languages and mimic a specific person's voice from a short reference audio clip. This capability, often called zero-shot voice cloning, is useful for custom voice assistants, dynamic audio content, and accessibility tools.

A Permissive Foundation

Qwen3-TTS is released under the permissive Apache 2.0 license, which allows commercial use, modification, and redistribution. That makes it an open alternative to proprietary text-to-speech APIs, one that developers can experiment with and build on.

The model, officially designated Qwen3-TTS-12Hz-0.6B-Base, is available now for download and use. As a "base" model, it is intended as a foundation for further fine-tuning on specific tasks or voices to achieve higher-quality, more specialized output.

Sources

  • Qwen/Qwen3-TTS-12Hz-0.6B-Base (Hugging Face)

Specs

  • Parameters: 600M
  • Context window: —
  • License: Apache-2.0
  • Downloads: 648.1K

Modalities

Text → Speech

More in Text → Speech

Zyphra
Zonos 2
Zyphra/Text → Speech

Zyphra Releases Open-Source Zonos 2 TTS Model

The new text-to-speech model offers a commercially permissive alternative for developers in a field still dominated by closed-source APIs.

Jun 11, 2026
Boson AI
Higgs Audio v3 TTS 4B
Boson AI/Text → Speech

Boson AI's Higgs Audio v3 Offers Expressive, Multilingual TTS

The new 4-billion-parameter text-to-speech model is available for non-commercial use, promising fine-grained control over vocal delivery.

Jun 4, 2026
OpenMOSS
MOSS-TTS v1.5
OpenMOSS/Text → Speech

MOSS-TTS Aims for More Robust Speech Synthesis

A new text-to-speech model introduces 'delay-pattern decoding' to solve common word skipping and repetition errors in parallel generation.

May 25, 2026