The Open Weights
The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


OpenBMB Releases VoxCPM2 for Expressive TTS

The new diffusion-based model from the OpenBMB research group supports multilingual speech, emotional control, and zero-shot voice cloning.

Apr 3, 2026
OpenBMB · Text → Speech

The OpenBMB research community has released VoxCPM2, an open-source text-to-speech model. Built on a diffusion-based architecture, it aims to generate high-fidelity, expressive speech in multiple languages.

Cloning and Control

VoxCPM2's standout feature is zero-shot voice cloning: given a 3-to-20 second audio sample of a target voice, it can generate speech in that voice without any speaker-specific training. The model also offers fine-grained control over the output. Key capabilities include:

  • Cross-lingual synthesis: Generate speech in one language using a voice from another (e.g., speaking Chinese with an English speaker's vocal characteristics).
  • Emotional control: Adjust the emotional tone of the generated speech.
  • Multilingual support: Primarily trained on Chinese and English.
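
Because cloning depends on a reference clip of 3 to 20 seconds, it can help to check a clip's length before passing it to the model. The following is a minimal sketch using only Python's standard library. It assumes the clip is an uncompressed PCM WAV file; the function names are hypothetical and are not part of any VoxCPM2 API.

```python
import io
import wave

MIN_SECONDS = 3.0   # lower bound quoted in the article
MAX_SECONDS = 20.0  # upper bound quoted in the article


def clip_duration_seconds(wav_file) -> float:
    """Return the duration of a PCM WAV file (a path or a file-like object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(wav_file) -> bool:
    """Return True if the clip falls inside the 3-to-20 second window."""
    return MIN_SECONDS <= clip_duration_seconds(wav_file) <= MAX_SECONDS


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence (demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for seconds in (1.0, 5.0, 25.0):
        clip = make_silent_wav(seconds)
        print(seconds, is_usable_reference(clip))
```

In practice the silent test clips would be replaced by a path to a real recording, for example `is_usable_reference("speaker.wav")`, where the file name is a placeholder.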

The model uses a two-stage cascaded diffusion process. The first stage converts text into a mel-spectrogram, a time-frequency representation of the audio on the perceptual mel scale. A second-stage vocoder then converts that spectrogram into the final audio waveform, a split that is widely used in high-quality speech synthesis.
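
The data flow of such a two-stage pipeline can be sketched as follows. This is a structural illustration only, not VoxCPM2's implementation: both stages are placeholders that return zeros, and the constants `N_MELS`, `HOP_LENGTH`, and `FRAMES_PER_CHAR` are assumed values chosen to show how the shapes relate.

```python
from typing import List

N_MELS = 80          # assumed number of mel frequency bands per frame
HOP_LENGTH = 256     # assumed audio samples produced per spectrogram frame
FRAMES_PER_CHAR = 5  # assumed frames per input character, purely illustrative

MelSpectrogram = List[List[float]]  # shape: (frames, N_MELS)
Waveform = List[float]              # shape: (samples,)


def text_to_mel(text: str) -> MelSpectrogram:
    """Stage 1 placeholder: a real acoustic model would predict these values."""
    n_frames = len(text) * FRAMES_PER_CHAR
    return [[0.0] * N_MELS for _ in range(n_frames)]


def mel_to_waveform(mel: MelSpectrogram) -> Waveform:
    """Stage 2 placeholder: a real vocoder would synthesize audio samples."""
    return [0.0] * (len(mel) * HOP_LENGTH)


def synthesize(text: str) -> Waveform:
    """Chain the two stages: text -> mel-spectrogram -> waveform."""
    mel = text_to_mel(text)
    return mel_to_waveform(mel)


if __name__ == "__main__":
    mel = text_to_mel("hello")
    audio = synthesize("hello")
    print(len(mel), len(mel[0]), len(audio))
```

The point of the split is that the intermediate spectrogram is much shorter than the waveform (one frame per `HOP_LENGTH` samples here), so the first stage works on a compact representation and the vocoder handles the fine-grained audio detail.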

VoxCPM2 is another significant step for open-source generative audio, offering capabilities that rival proprietary systems and giving researchers and developers a tool for building custom voice applications. The model is available for download on the Hugging Face Hub. It ships under a custom "OpenBMB Model License," so users should review its terms before deploying it.

Sources

  • openbmb/VoxCPM2 (Hugging Face)


Specs

Parameters: not listed
Context window: not listed
License: OTHER
Downloads: 273.2K

Modalities

Text → Speech
