NVIDIA Releases 600M Parakeet for Speech Recognition
The new FastConformer model pairs its encoder with a token-and-duration transducer decoder for fast, accurate multilingual transcription.

NVIDIA has released a new open model for automatic speech recognition (ASR) called Parakeet TDT 0.6B. Built with the company's NeMo toolkit for conversational AI, the 600-million-parameter model is designed to transcribe speech across multiple languages with high accuracy.
The model's architecture is key to its performance. It uses a FastConformer encoder, a Conformer variant that downsamples audio more aggressively so that long sequences are cheaper to process. The "TDT" in its name stands for Token-and-Duration Transducer, a decoder that predicts each output token together with the number of audio frames that token spans. Because the decoder can jump ahead by the predicted duration instead of stepping through every frame, inference is faster than with a conventional transducer.
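A token-and-duration transducer emits, at each decoding step, both a token and a count of encoder frames to skip. The toy sketch below illustrates that greedy loop only; it is not NeMo's implementation. The names `tdt_greedy_decode` and `toy_joint`, and the scripted predictions, are hypothetical stand-ins for a real joint network.

```python
from typing import Callable, List, Tuple

BLANK = "<blank>"


def tdt_greedy_decode(
    num_frames: int,
    joint: Callable[[int, List[str]], Tuple[str, int]],
    max_symbols_per_frame: int = 10,
) -> Tuple[List[str], int]:
    """Greedy decoding where each step yields (token, duration).

    `joint(t, history)` is a hypothetical stand-in for the joint network:
    it returns the best token at frame t and how many frames to advance.
    Returns the emitted tokens and the number of joint evaluations.
    """
    tokens: List[str] = []
    t = 0
    joint_calls = 0
    symbols_at_frame = 0
    while t < num_frames:
        token, duration = joint(t, tokens)
        joint_calls += 1
        if token != BLANK:
            tokens.append(token)
        if duration == 0:
            # Duration 0 means "emit again at this frame". Force progress
            # for blanks or after too many symbols, to avoid looping forever.
            symbols_at_frame += 1
            if token == BLANK or symbols_at_frame >= max_symbols_per_frame:
                duration = 1
        if duration > 0:
            symbols_at_frame = 0
        t += duration  # skip ahead by the predicted duration
    return tokens, joint_calls


# Scripted predictions keyed by frame index (illustrative values only).
SCRIPT = {0: ("he", 2), 2: ("llo", 3), 5: (BLANK, 1), 6: ("world", 4)}


def toy_joint(t: int, history: List[str]) -> Tuple[str, int]:
    return SCRIPT.get(t, (BLANK, 1))


tokens, calls = tdt_greedy_decode(num_frames=10, joint=toy_joint)
print(tokens, calls)
```

In this scripted run the decoder covers 10 frames with 4 joint evaluations, because each predicted duration lets it skip frames it would otherwise have to visit one by one.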
This release gives developers a capable and relatively lightweight tool for building speech-enabled products. Under the CC-BY-4.0 license, Parakeet can be used and modified for both research and commercial projects, provided attribution is given. Its 0.6-billion-parameter size makes it easier to deploy than the multi-billion-parameter systems that often dominate ASR research.
By open-sourcing a specialized model like Parakeet, NVIDIA is contributing a significant building block to the conversational AI ecosystem. Developers interested in experimenting with the model can find the weights and usage instructions on the official Hugging Face repository.
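For readers who want to try it, here is a minimal sketch of loading the model through NeMo's ASR collection. It assumes NeMo with its ASR extras is installed and that the audio files exist locally; the helper names `text_of` and `transcribe_files` are hypothetical, and the repository's own instructions take precedence if they differ.

```python
from typing import Any, List


def text_of(result: Any) -> str:
    """Extract the transcript from one transcription result.

    Assumption: recent NeMo releases return hypothesis objects with a
    `.text` attribute, while older releases return plain strings.
    """
    if isinstance(result, str):
        return result
    return result.text


def transcribe_files(
    paths: List[str],
    model_name: str = "nvidia/parakeet-tdt-0.6b-v3",
) -> List[str]:
    """Download the checkpoint (on first use) and transcribe audio files."""
    # Imported lazily so this module can be loaded without NeMo installed.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    return [text_of(r) for r in model.transcribe(paths)]
```

Calling `transcribe_files(["sample.wav"])` with a placeholder file name of your own would return one transcript string per input file; the first call downloads the weights.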
Sources
- nvidia/parakeet-tdt-0.6b-v3 (Hugging Face)