The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

Qwen Releases Open 1.7B Custom Voice Synthesis Model

Alibaba's Qwen team has released a new text-to-speech model capable of cloning voices from just a few seconds of audio.

Jan 21, 2026
Notable · Apache 2.0
Qwen · Alibaba · Text → Speech
Qwen3-TTS 1.7B CustomVoice

The Qwen team at Alibaba has released Qwen3-TTS, an open-source text-to-speech (TTS) model. The 1.7-billion-parameter model is designed to generate high-quality speech from text and is available under the Apache 2.0 license, which permits commercial use.

The standout feature is custom voice cloning. According to the release documentation, developers supply a short reference clip, typically 3 to 10 seconds long, and the model synthesizes speech in that voice. That makes it a candidate for personalized and dynamically generated audio content.
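As a minimal sketch of preparing a reference clip, the snippet below checks that a WAV file falls inside the 3 to 10 second range quoted above. It uses only Python's standard `wave` module and does not call the model. The function names and the decision to treat the range as a hard filter are assumptions made for illustration; the release documentation describes the range as typical, not mandatory.

```python
import io
import wave

# Range quoted in the article; treated here as a guideline for illustration,
# not as a documented hard limit of the model.
MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def clip_duration_seconds(wav_file) -> float:
    """Return the duration of a PCM WAV file (a path or a file-like object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(wav_file) -> bool:
    """Hypothetical helper: True if the clip length is within the quoted range."""
    return MIN_SECONDS <= clip_duration_seconds(wav_file) <= MAX_SECONDS


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, used only to demo the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for seconds in (1.0, 5.0, 12.0):
        print(seconds, is_usable_reference(make_silent_wav(seconds)))
```

In practice the path of a real recording would be passed to `is_usable_reference` in place of the generated silence.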

Technical Details

The model, named Qwen3-TTS-12Hz-1.7B-CustomVoice, works in two stages. First, a text-to-acoustic model generates an intermediate audio representation from the input text and a voice embedding derived from the reference audio. Then a vocoder converts that representation into the final audio waveform. The "12Hz" in the name refers to the tokenization rate, that is, how many audio tokens the model uses to represent each second of speech.
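To make the tokenization rate concrete, here is a rough back-of-the-envelope sketch. It assumes the "12Hz" label means exactly 12 token frames per second of audio, which is a literal reading of the model name and not a figure confirmed by the article. The function name is hypothetical.

```python
import math

# Assumption: "12Hz" is read literally as 12 token frames per second of audio.
TOKEN_RATE_HZ = 12


def frames_for_duration(seconds: float, rate_hz: float = TOKEN_RATE_HZ) -> int:
    """Estimate how many token frames are needed to cover a clip of this length."""
    return math.ceil(seconds * rate_hz)


if __name__ == "__main__":
    # Shortest and longest typical reference clips, plus one minute of output.
    for seconds in (3, 10, 60):
        print(f"{seconds:>3} s -> {frames_for_duration(seconds)} frames")
```

Under that assumption a 10-second reference clip maps to 120 frames, which illustrates why a low token rate keeps sequences short for the text-to-acoustic stage.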

This release adds to the growing ecosystem of open-source generative audio. With a capable, permissively licensed voice cloning model, developers can build more personalized voice applications, from custom assistants to accessibility tools. The model and usage instructions are available on the Hugging Face Hub.

Sources

  • Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice (Hugging Face)

Specs

Parameters: 1.7B
Context window: not listed
License: Apache 2.0
Downloads: 1.6M

Modalities

Text → Speech
