The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

Refreshed every 12 hours


© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


NVIDIA Releases Streaming Speech-to-Text Model

The 600-million-parameter Nemotron model is designed for real-time English transcription using a cache-aware FastConformer architecture.

Dec 17, 2025
NVIDIA · Speech → Text
Nemotron Speech Streaming EN 0.6B

NVIDIA has released Nemotron Speech Streaming EN 0.6B, a new model for automatic speech recognition (ASR). The 600-million-parameter model is built for real-time, streaming transcription of English audio, which suits applications that need text while the speaker is still talking.

Built for Real-Time Performance

The model is based on the FastConformer architecture, an efficiency-oriented variant of the Conformer speech encoder. Its key feature is "cache-aware streaming": it processes audio in small chunks as they arrive rather than waiting for an entire recording. The model carries its internal state, or cache, from one chunk to the next, so it does not have to reprocess earlier audio and can deliver continuous transcription with minimal delay.
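To make the idea of carrying a cache between chunks concrete, here is a minimal toy sketch in Python. It is not NVIDIA's implementation and does not use the model or any NVIDIA API. The names `causal_layer`, `stream_step`, `run_streaming`, and the constant `LEFT_CONTEXT` are hypothetical, and a simple causal moving average stands in for a real encoder layer. The only point it illustrates is that caching a fixed amount of past context lets chunk-by-chunk processing reproduce the result of processing the whole signal at once.

```python
from typing import List, Tuple

# Assumption for illustration: each output frame depends on the current
# frame plus at most this many past frames.
LEFT_CONTEXT = 3


def causal_layer(frames: List[float], history: List[float]) -> List[float]:
    """Toy stand-in for an encoder layer: a causal moving average.

    `history` holds past frames that precede `frames` in time.
    """
    padded = history + frames
    offset = len(history)
    outputs = []
    for i in range(len(frames)):
        j = offset + i
        window = padded[max(0, j - LEFT_CONTEXT): j + 1]
        outputs.append(sum(window) / len(window))
    return outputs


def stream_step(chunk: List[float], cache: List[float]) -> Tuple[List[float], List[float]]:
    """Process one chunk and return (outputs, updated cache)."""
    outputs = causal_layer(chunk, cache)
    # Keep only as much past context as the next chunk will need.
    new_cache = (cache + chunk)[-LEFT_CONTEXT:]
    return outputs, new_cache


def run_streaming(signal: List[float], chunk_size: int) -> List[float]:
    """Feed the signal chunk by chunk, carrying the cache between chunks."""
    cache: List[float] = []
    outputs: List[float] = []
    for start in range(0, len(signal), chunk_size):
        chunk_out, cache = stream_step(signal[start:start + chunk_size], cache)
        outputs.extend(chunk_out)
    return outputs


signal = [float(x) for x in range(10)]
offline = causal_layer(signal, [])            # whole signal at once
streamed = run_streaming(signal, chunk_size=4)  # chunked, with cache
print(streamed == offline)
```

In this toy setup the cache never grows beyond `LEFT_CONTEXT` frames, so memory and per-chunk work stay bounded no matter how long the audio runs. A real streaming encoder would cache layer activations rather than raw input values, but the bookkeeping follows the same pattern.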

This streaming approach is critical for interactive voice applications. Potential use cases include:

  • Live captioning for broadcasts and events
  • Responsive voice assistants
  • Real-time transcription for meetings or customer service calls

The release gives developers another option for building responsive, voice-enabled products. The model is available on Hugging Face under the NVIDIA Open Model License Agreement, and full details are in the official repository.

Sources

  • nvidia/nemotron-speech-streaming-en-0.6b (Hugging Face)


Specs

  • Parameters: 600M
  • Context window: —
  • License: Other (NVIDIA Open Model License Agreement)
  • Downloads: 6.2K
  • Modalities: Speech → Text
