The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

NVIDIA's Parakeet ASR Tackles Multi-Speaker Audio

The 600-million-parameter model offers real-time speech-to-text with speaker diarization, built on the efficient FastConformer architecture.

Oct 15, 2025

NVIDIA has released Multitalker Parakeet Streaming 0.6B, a new open model designed to transcribe conversations with multiple participants in real time. The 600-million-parameter model addresses a common challenge for automatic speech recognition (ASR) systems: accurately capturing dialogue when more than one person is speaking.

Real-Time Diarization

The model's key capability is speaker diarization: determining "who spoke when." Because diarization is integrated directly into the architecture, Parakeet can attribute transcribed text to the correct speaker as the audio is being processed. This streaming operation, built on the efficient FastConformer architecture, is essential for live applications where low latency is critical.

Attributing speech to the right speaker as it is transcribed makes automated transcripts more useful and accurate. Potential applications include:

  • Live transcription and captioning for meetings and events.
  • Analyzing multi-participant audio from call centers.
  • Creating searchable records of interviews or panel discussions.
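
To make the "who spoke when" idea concrete, here is a minimal sketch of how speaker-tagged words arriving from a streaming recognizer could be folded into readable speaker turns. The `Word` and `Turn` types, the `group_into_turns` helper, and the speaker labels are all hypothetical and chosen for illustration; the article does not describe the model's actual output format, so this is not NVIDIA's API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    """One recognized word with a speaker label (hypothetical format)."""
    text: str
    speaker: str
    start: float  # seconds from the beginning of the audio
    end: float


@dataclass
class Turn:
    """A run of consecutive words attributed to the same speaker."""
    speaker: str
    start: float
    end: float
    text: str


def group_into_turns(words: Iterable[Word]) -> List[Turn]:
    """Merge consecutive same-speaker words into turns, in arrival order."""
    turns: List[Turn] = []
    for word in words:
        if turns and turns[-1].speaker == word.speaker:
            # Same speaker is still talking: extend the current turn.
            turns[-1].end = word.end
            turns[-1].text += " " + word.text
        else:
            # Speaker changed (or first word): open a new turn.
            turns.append(Turn(word.speaker, word.start, word.end, word.text))
    return turns


if __name__ == "__main__":
    # Illustrative data only, not real model output.
    stream = [
        Word("hello", "spk0", 0.0, 0.4),
        Word("there", "spk0", 0.4, 0.8),
        Word("hi", "spk1", 0.9, 1.1),
        Word("again", "spk0", 1.2, 1.6),
    ]
    for turn in group_into_turns(stream):
        print(f"[{turn.start:.1f}s-{turn.end:.1f}s] {turn.speaker}: {turn.text}")
```

Because the helper only ever looks at the most recent turn, it could be applied incrementally as words arrive, which is the property a live captioning or meeting-notes application would need.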

Available now on Hugging Face, the Multitalker Parakeet Streaming 0.6B model is released under the NVIDIA Open Model License Agreement. While not a traditional permissive open-source license, it allows for broad access and use of the model's weights. Its relatively compact size could enable deployment in a wide range of on-device or cloud environments.

Sources

  • nvidia/multitalker-parakeet-streaming-0.6b-v1 (Hugging Face)


Specs

  • Parameters: 600M
  • Context window: —
  • License: Other
  • Downloads: 335

Modalities

Speech → Text
