The Open Weights

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


NVIDIA Releases Canary 1B v2 Multilingual Speech Model

The new 1-billion-parameter model handles both transcription and translation across 25 European languages using the company's efficient FastConformer architecture.

Aug 4, 2025
NVIDIA · Speech → Text · CC BY 4.0

NVIDIA has released Canary 1B v2, a 1-billion-parameter model for automatic speech recognition (ASR) and speech translation. It is published under the permissive CC-BY-4.0 license, which allows commercial use with attribution.

The model is built on NVIDIA's FastConformer architecture, which is designed for efficient speech processing. Canary is multilingual: a single model handles both transcription in a source language and translation between that language and English.

Core Capabilities

According to its model card, Canary 1B v2 was trained to handle several tasks without separate models:

  • Transcription: Supports 25 European languages, including English, German, French, and Spanish.
  • Translation: Translates between English and the other 24 supported languages, in either direction.
  • Formatting: Includes automatic punctuation and capitalization to produce more readable output.
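
One way to picture a single model covering both tasks: the caller picks the task through a source/target language pair. The sketch below is illustrative only. `SpeechRequest` and `run_with_nemo` are hypothetical helper names, and the NeMo call inside `run_with_nemo` is an assumption based on NVIDIA's published usage pattern for Canary models; it needs the NeMo toolkit installed and downloads the weights, so check the model card before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechRequest:
    """One job for a combined ASR/translation model (hypothetical helper type)."""
    audio_path: str
    source_lang: str  # language spoken in the audio, e.g. "de"
    target_lang: str  # language of the desired text, e.g. "en"

    @property
    def task(self) -> str:
        # Same language in and out means transcription; otherwise translation.
        if self.source_lang == self.target_lang:
            return "transcription"
        return "translation"


def run_with_nemo(jobs):
    """Assumed NeMo usage; requires the NeMo ASR toolkit and a model download."""
    # Imported lazily so the rest of this sketch runs without NeMo installed.
    from nemo.collections.asr.models import ASRModel

    model = ASRModel.from_pretrained(model_name="nvidia/canary-1b-v2")
    texts = []
    for job in jobs:
        output = model.transcribe(
            [job.audio_path],
            source_lang=job.source_lang,
            target_lang=job.target_lang,
        )
        texts.append(output[0].text)
    return texts


if __name__ == "__main__":
    jobs = [
        SpeechRequest("clip.wav", "de", "de"),  # German audio to German text
        SpeechRequest("clip.wav", "de", "en"),  # German audio to English text
    ]
    for job in jobs:
        print(job.task, job.source_lang, "->", job.target_lang)
```

Running the file directly only prints the planned task for each job; calling `run_with_nemo(jobs)` is what would load the model and process audio.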

This release adds another openly licensed option to a field largely defined by models like OpenAI's Whisper. With a permissive license and an efficient architecture, it gives developers more flexibility when integrating speech recognition and translation. The model and its usage instructions are available on NVIDIA's Hugging Face page.

Sources

  • nvidia/canary-1b-v2 (Hugging Face)

Specs

  • Parameters: 1B
  • Context window: not listed
  • License: CC-BY-4.0
  • Downloads: 79.1K

Modalities

Speech → Text

More in Speech → Text

zhifeixie · Mega-ASR · Speech → Text

Mega-ASR Improves on Qwen for Speech Recognition

Researcher Zhifei Xie has released a 1.7B-parameter model that refines Alibaba's Qwen3-ASR, showing improved performance on English and Chinese transcription benchmarks.

May 19, 2026

NVIDIA · Nemotron 3.5 ASR Streaming 0.6B · Speech → Text

NVIDIA Releases Nemotron-3.5 Streaming ASR Model

The 600-million-parameter model uses a FastConformer architecture for real-time, multilingual speech-to-text applications.

May 15, 2026

Xiaomi · MiMo-V2.5-ASR · Speech → Text

Xiaomi Releases MiMo Model for Speech Recognition

The new open-source model from the Chinese tech giant offers automatic speech recognition for Mandarin, Cantonese, and English under a permissive MIT license.

Apr 23, 2026