Google Releases 4B Multimodal Gemma 4 Assistant
The new 4-billion-parameter model is instruction-tuned for 'any-to-any' tasks, handling a flexible mix of data types.
Google DeepMind has expanded its open-source offerings with the release of Gemma 4 E4B-it Assistant, a new model in the Gemma family. At 4 billion parameters, this model is designed for efficiency while packing a significant new capability: advanced multimodality.
The model's standout feature is its 'any-to-any' architecture. Unlike models limited to text-and-image pairs, Gemma 4 is built to handle a more flexible combination of inputs and outputs. This design opens up possibilities for more integrated and versatile AI applications that can reason across different types of information simultaneously.
As an instruction-tuned ('it') variant, the model is optimized out of the box for conversational AI and assistant-like interactions. This fine-tuning makes it easier for developers to build responsive and helpful applications without extensive additional training.
By releasing a capable and efficient multimodal model under a commercially permissive Apache 2.0 license, Google is providing developers with a key building block for the next wave of AI applications. Researchers and engineers can explore the model's capabilities and access the full release details on the Hugging Face Hub. This move brings functionality often associated with larger, closed systems to the open-source community, enabling broader experimentation and innovation.
Sources
- google/gemma-4-E4B-it-assistant (Hugging Face)