Google DeepMind/Any-to-Any

Google Releases 4B Multimodal Gemma 4 Assistant

The new 4-billion-parameter model is instruction-tuned for 'any-to-any' tasks, handling a flexible mix of data types.

Apr 23, 2026

Notable | Apache 2.0

Google DeepMind has expanded its open-source offerings with the release of Gemma 4 E4B-it Assistant, a new model in the Gemma family. At 4 billion parameters, this model is designed for efficiency while packing a significant new capability: advanced multimodality.
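To give the efficiency claim a rough scale, the sketch below estimates how much memory the weights alone would occupy at a few common numeric precisions. It takes the article's 4-billion-parameter figure at face value. The bytes-per-parameter table and the helper name `weight_memory_gb` are illustrative assumptions, not details from the release, and the estimate ignores activations, caches, and runtime overhead.

```python
# Back-of-the-envelope estimate of weight storage only.
# Activations, KV cache, and framework overhead are NOT included.

PARAMS = 4_000_000_000  # parameter count as stated in the article

# Assumed bytes per parameter for common storage formats
# (illustrative; the release may ship in different formats).
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt:>8}: {weight_memory_gb(PARAMS, size):.1f} GB")
```

Under these assumptions, 16-bit weights come to about 8 GB and 4-bit weights to about 2 GB, which is the range where a model of this size becomes practical on a single consumer GPU or a laptop.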

The model's standout feature is its 'any-to-any' architecture. Unlike models limited to text-and-image pairs, Gemma 4 is built to handle a more flexible combination of inputs and outputs. This design opens up possibilities for more integrated and versatile AI applications that can reason across different types of information simultaneously.

As an instruction-tuned ('it') variant, the model is optimized out of the box for conversational AI and assistant-style interactions. This fine-tuning lets developers build responsive, helpful applications without extensive additional training.

By releasing a capable and efficient multimodal model under a commercially permissive Apache 2.0 license, Google is providing developers with a key building block for the next wave of AI applications. Researchers and engineers can explore the model's capabilities and access the full release details on the Hugging Face Hub. This move brings functionality often associated with larger, closed systems to the open-source community, enabling broader experimentation and innovation.

Sources

google/gemma-4-E4B-it-assistant
Hugging Face


Thinking Machines Debuts Inkling Small, a Compact Multimodal MoE

The Apache-2.0 model brings mixture-of-experts efficiency to image, audio, and text tasks in a smaller footprint.

Jul 27, 2026

KRAFTON/Any-to-Any

KRAFTON releases A.X-K2 Raon speech MoE model

The game maker's new open model blends text-to-speech and speech recognition in a single 21B mixture-of-experts system with just 3B active parameters.

Jul 27, 2026

Microsoft/Vision-Language

Microsoft's Mage-VL Streams Video Natively

A codec-native multimodal foundation model aims to understand live video and vision-language input in real time.

Jul 26, 2026
