IBM Releases 2B Granite Model for Multilingual Speech
The new two-billion-parameter model offers transcription capabilities for at least five major languages under a permissive Apache 2.0 license.

IBM has released Granite Speech 4.1, a two-billion-parameter open-source model for automatic speech recognition (ASR), also known as speech-to-text. The model is licensed under Apache 2.0, so developers can download it and integrate it into their own applications, including commercial ones.
The release gives developers a base for building multilingual voice applications. The model was trained to transcribe several languages, which makes it useful to teams whose users do not all speak the same language.
Multilingual Capabilities
While details on the full training data are pending, the model explicitly supports high-quality transcription for at least five languages:
- English
- French
- German
- Italian
- Spanish
The open availability of a capable ASR model from a major enterprise tech company like IBM is a notable development. It provides a commercially viable alternative to proprietary APIs and adds another powerful option alongside existing open-source models. Developers can access the full model weights and usage instructions on its Hugging Face repository.
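For readers who want to fetch the weights programmatically, the following is a minimal sketch, assuming the `huggingface_hub` Python package is installed and that the repository is publicly downloadable without authentication. The helper names `repo_url` and `download_weights` are hypothetical and introduced only for illustration; how to load the model and run transcription is not covered here and should be taken from the repository's own usage instructions.

```python
from typing import Optional

# Repository ID as listed in this article's sources.
REPO_ID = "ibm-granite/granite-speech-4.1-2b"


def repo_url(repo_id: str = REPO_ID) -> str:
    """Build the Hugging Face web URL for a repository ID (hypothetical helper)."""
    return f"https://huggingface.co/{repo_id}"


def download_weights(repo_id: str = REPO_ID, local_dir: Optional[str] = None) -> str:
    """Download every file in the repository and return the local folder path.

    Assumption: huggingface_hub is installed. It is imported lazily so that
    the rest of this module works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    print("Model page:", repo_url())
    # Uncomment to download; the files for a two-billion-parameter model
    # are large, so check disk space and bandwidth first.
    # print("Saved to:", download_weights())
```

The download call is left commented out so that running the script only prints the model page address. Passing `local_dir` stores the files in a folder of your choice instead of the default Hugging Face cache.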
Sources
- ibm-granite/granite-speech-4.1-2b (Hugging Face)