IBM Releases 2B Granite Model for Multilingual Speech

The new two-billion-parameter model offers transcription capabilities for at least five major languages under a permissive Apache 2.0 license.

Apr 16, 2026

IBM has entered the open-source speech recognition arena with Granite Speech 4.1, a new two-billion-parameter model. Released under the permissive Apache 2.0 license, the model is designed for automatic speech recognition (ASR), also known as speech-to-text, and is available for developers to download and integrate freely.

The release gives developers an openly licensed base for building multilingual voice applications. The model was trained to transcribe several languages, which makes it useful to teams shipping products for international audiences.

Multilingual Capabilities

While details on the full training data are pending, the model explicitly supports high-quality transcription for at least five languages:

English
French
German
Italian
Spanish

The open availability of a capable ASR model from a major enterprise tech company like IBM is a notable development. It provides a commercially viable alternative to proprietary APIs and adds another powerful option alongside existing open-source models. Developers can access the full model weights and usage instructions on its Hugging Face repository.
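For readers who want to pull the weights locally, the snippet below is a minimal sketch that uses the `huggingface_hub` library's `snapshot_download` function with the repository ID listed under Sources. The helper names (`download_model`, `is_listed_language`) and the ISO 639-1 language codes are illustrative assumptions, not part of IBM's documentation; the repository's own usage instructions remain the authoritative guide to running transcription.

```python
from typing import Optional

# Repository ID as listed in this article's Sources section.
REPO_ID = "ibm-granite/granite-speech-4.1-2b"

# The five languages named in the article, keyed by ISO 639-1 code.
# The codes are an assumption for illustration; the model may accept
# other identifiers or support additional languages.
LISTED_LANGUAGES = {
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "es": "Spanish",
}


def is_listed_language(code: str) -> bool:
    """Return True if the code is one of the five languages the article names."""
    return code.strip().lower() in LISTED_LANGUAGES


def download_model(local_dir: Optional[str] = None) -> str:
    """Download all files in the repository and return the local folder path.

    Hypothetical helper. Requires `pip install huggingface_hub` and network
    access. The import is done lazily so the rest of the module loads
    without the library installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_model()` with no arguments stores the files in the default Hugging Face cache, while passing a `local_dir` places them in a folder of your choice. Because the article says the model supports "at least" five languages, a `False` result from `is_listed_language` only means the language is not named here, not that the model cannot handle it.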

Sources

ibm-granite/granite-speech-4.1-2b
Hugging Face
