Google DeepMindVision-Language

Google's MedGemma brings open vision AI to medicine

The new 4-billion-parameter vision-language model is specialized for tasks in radiology, pathology, and complex clinical reasoning.

Jan 7, 2026

NotableGemma

Google DeepMind has released MedGemma 1.5, a new open-weights model designed specifically for the medical domain. This 4-billion-parameter model extends the capabilities of the Gemma family into specialized areas, focusing on vision-language tasks that are critical for healthcare.

As a vision-language model (VLM), MedGemma is trained to understand and reason about both text and images. According to its official model card, its instruction tuning targets complex use cases like interpreting radiological scans, analyzing pathology slides, and aiding in multifaceted clinical reasoning.

The release marks a significant step for open-source AI in healthcare. By providing researchers and developers with a capable, specialized foundation model, MedGemma could accelerate the development of new diagnostic aids, medical education tools, and clinical support systems that were previously reliant on proprietary, closed-source models.

MedGemma 1.5 is available under the Gemma license, which permits commercial use and distribution subject to its terms. This particular version, medgemma-1.5-4b-it, is an instruction-tuned variant, making it ready for conversational and question-answering applications out of the box.

Sources

google/medgemma-1.5-4b-it
Hugging Face
Visit

0 comments

No comments yet. Be the first to weigh in.

Thinking Machines Debuts Inkling Small, a Compact Multimodal MoE

The Apache-2.0 model brings mixture-of-experts efficiency to image, audio, and text tasks in a smaller footprint.

Jul 27, 2026

Microsoft/Vision-Language

Microsoft's Mage-VL Streams Video Natively

A codec-native multimodal foundation model aims to understand live video and vision-language input in real time.

Jul 26, 2026

Swiss Ai/Text / LLM

Apertus v1.5 70B arrives with an Apache-2.0 license

Switzerland's open-model effort ships a 70-billion-parameter, multilingual and multimodal system that anyone can use, modify, and deploy.

Jul 24, 2026