Google Releases Gemma 4 12B Multimodal Model
The new 12-billion-parameter open model from DeepMind introduces a unified 'any-to-any' architecture for advanced multimodal tasks.
Google DeepMind has released Gemma 4 12B, the latest generation of its open model family. The 12-billion-parameter model is available under the permissive Apache 2.0 license, continuing Google's line of releases for the open-source AI community.
Unlike many existing vision-language models, Gemma 4 is built on what Google calls a "unified any-to-any" architecture. This design aims to natively handle a wide variety of data modalities for both input and output, moving beyond the common text-and-image limitations of previous systems.
Why 'Any-to-Any' Matters
This approach matters to developers because it simplifies building applications that must interpret and generate combinations of data types, such as text, images, and potentially other formats in the future. Instead of chaining together multiple specialized models, developers can work with a single integrated system, which could enable more fluid and capable AI assistants, creative tools, and analysis engines.
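To make the contrast concrete, here is a minimal Python sketch of the two application designs. It does not call Gemma 4 or any real library: `caption_image`, `answer_text`, and `unified_model` are hypothetical stand-ins that only return placeholder strings, and the `Part` type is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Part:
    """One piece of a multimodal message (hypothetical type for this sketch)."""
    modality: str  # e.g. "text" or "image"
    payload: str   # placeholder content; a real system would carry bytes or tensors


# Chained design: one hypothetical specialist model per step.
def caption_image(image: Part) -> Part:
    # Stand-in for a separate image-captioning model.
    return Part("text", f"caption of {image.payload}")


def answer_text(question: Part, context: Part) -> Part:
    # Stand-in for a separate text-only language model.
    return Part("text", f"answer to '{question.payload}' given '{context.payload}'")


def chained_pipeline(question: Part, image: Part) -> Part:
    # The image is reduced to a caption before the language model sees it,
    # so the application owns the glue code between the two models.
    caption = caption_image(image)
    return answer_text(question, caption)


# Unified design: a single hypothetical entry point for mixed inputs.
def unified_model(parts: List[Part], output_modality: str) -> Part:
    # Stand-in for one model that accepts all parts directly.
    seen = ", ".join(p.modality for p in parts)
    return Part(output_modality, f"{output_modality} response to [{seen}]")


question = Part("text", "What is shown?")
image = Part("image", "photo.png")

print(chained_pipeline(question, image).payload)
print(unified_model([question, image], "text").payload)
```

In the chained version the application has to decide how each model's output feeds the next, and detail can be lost at every hand-off. In the unified version the caller passes all inputs at once and names the output type it wants.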
By releasing a model with this multimodal design, Google gives open-source developers a new foundation to build on. Researchers and engineers can now experiment with the architecture and extend it in their own work. The model is available now on Hugging Face.
Sources
- google/gemma-4-12B (Hugging Face)