Qwen · Alibaba · Speech → Text

Qwen open-sources compact model for speech recognition

The new 600-million-parameter Qwen3-ASR model is designed for efficient, high-quality audio transcription under a permissive license.

Jan 28, 2026


Alibaba's Qwen team has released a new open-source model specialized for automatic speech recognition (ASR). The model, named Qwen3-ASR-0.6B, stands out for its compact size, with just 600 million parameters. This release continues Qwen's expansion beyond large language models into more specialized, efficient AI tools.

The model converts spoken language into text, and its small footprint makes it a compelling option for applications where computational resources are constrained. These could include on-device transcription, real-time voice assistants, and other edge computing scenarios that require low latency and minimal overhead.
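To put the footprint in rough numbers, the sketch below estimates how much memory the weights alone would occupy at several common numeric precisions. It assumes the nominal 600 million parameter figure (the exact count may differ), ignores activations and runtime overhead, and does not imply that the model is actually distributed in any of these formats; the function and constant names are illustrative only.

```python
# Back-of-the-envelope weight-memory estimate (weights only, no activations).
# Bytes per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Nominal figure from the article; the exact parameter count may differ.
NUM_PARAMS = 600_000_000


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

At 16-bit precision this comes to roughly 1.2 GB of weights, which is within the memory budget of many phones and single-board computers, although real-world usage will be higher once activations and runtime buffers are included.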

A Versatile Tool for Developers

The choice of an Apache 2.0 license is a significant detail: it permits developers to use, modify, and redistribute the model, including for commercial purposes, provided they retain the license and attribution notices. This lowers the barrier to entry for building voice-enabled products.

By providing a capable yet lightweight ASR model, Qwen is offering a valuable alternative to larger, more resource-intensive systems. Developers can find the model and usage instructions on its Hugging Face repository.

Sources

Qwen/Qwen3-ASR-0.6B (Hugging Face)

