The Open Weights

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Qwen Unveils Open Model for Custom Voice Synthesis

The new 1.7-billion-parameter text-to-speech model from Alibaba's Qwen team can generate novel voices from short audio prompts.

Jan 21, 2026
Notable · Apache 2.0
Qwen · Alibaba · Text → Speech
Qwen3-TTS-12Hz-1.7B-VoiceDesign

Alibaba's Qwen team has expanded its open-source offerings with Qwen3-TTS, a new model dedicated to high-quality speech synthesis. Released under a permissive Apache 2.0 license, this 1.7-billion-parameter system marks a significant entry into the growing field of open text-to-speech (TTS) models.

The model's standout feature is its "Voice Design" capability. Unlike traditional TTS systems that rely on a fixed set of pre-recorded voices, Qwen3-TTS can generate speech in a novel voice by analyzing a short audio prompt. This allows developers to create unique voices or clone existing ones for custom applications, a capability previously found mostly in proprietary, API-driven systems.

Multilingual and Prompt-Driven

Qwen3-TTS is designed to be multilingual and is controlled through a combination of text and audio inputs. A user provides the text to be spoken along with a reference audio clip, and the model generates speech that matches the voice characteristics of the reference. The "12Hz" in the model's name likely refers to the frame rate of its internal audio representation, that is, roughly 12 discrete frames per second of audio. Modern neural audio codecs use low frame rates like this so that the model has far fewer steps to predict for each second of speech.
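To make the "12Hz" reading concrete, here is a minimal arithmetic sketch. It assumes the figure means about 12 internal frames per second, which the model name suggests but this article does not confirm. The function names and the 24 kHz output rate are hypothetical, chosen only for illustration, and are not taken from the Qwen3-TTS release.

```python
import math


def codec_frames(duration_s: float, frame_rate_hz: float = 12.0) -> int:
    """Frames needed to cover `duration_s` seconds of audio.

    frame_rate_hz=12.0 is an assumption based on the "12Hz" in the model name.
    """
    if duration_s < 0 or frame_rate_hz <= 0:
        raise ValueError("duration must be >= 0 and frame rate must be > 0")
    return math.ceil(duration_s * frame_rate_hz)


def samples_per_frame(sample_rate_hz: int, frame_rate_hz: float = 12.0) -> float:
    """Waveform samples represented by a single frame."""
    if sample_rate_hz <= 0 or frame_rate_hz <= 0:
        raise ValueError("rates must be > 0")
    return sample_rate_hz / frame_rate_hz


if __name__ == "__main__":
    # A 10-second utterance at the assumed 12 frames per second.
    print(codec_frames(10.0))
    # Hypothetical 24 kHz waveform: how many samples one frame stands for.
    print(samples_per_frame(24_000))
```

Under these assumptions, a 10-second clip is a sequence of only 120 frames, each standing in for 2,000 waveform samples. That compression is what makes a low frame rate attractive for speech models.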

The release of a powerful, commercially permissive voice design model like Qwen3-TTS is a notable development for the open-source AI community. It provides a foundational tool for a wide range of applications, including personalized digital assistants, dynamic video game character dialogue, and accessibility tools, without the restrictions of closed platforms.

Sources

  • Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign (Hugging Face)

Specs

Parameters: 1.7B
Context window: not listed
License: Apache-2.0
Downloads: 566.9K

Modalities

Text → Speech