The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

FlashLabs Releases Chroma-4B, an Any-to-Any Model

The new 4-billion-parameter model handles text, image, and speech inputs and outputs, including direct speech-to-speech translation.

Nov 28, 2025
Update · Apache 2.0
FlashLabs · Any-to-Any

AI research group FlashLabs has released Chroma-4B, a new multimodal model designed for true “any-to-any” capabilities. The 4-billion-parameter model is available under an Apache 2.0 license, making it accessible for both research and commercial applications.

Unlike many multimodal models that are limited to text and image processing, Chroma-4B can understand and generate content across text, images, and audio. This enables use cases that earlier open-source models have struggled with.

A More Flexible Multimodal Architecture

The model's key feature is its ability to handle complex input and output combinations. According to the release documentation, Chroma-4B supports tasks such as:

  • Direct speech-to-speech translation
  • Generating an audio description from an image
  • Answering text-based questions about an audio clip

This versatility stems from a unified architecture that processes all modalities within a single framework, rather than relying on separate, specialized components.

At 4 billion parameters, Chroma-4B is smaller than many flagship models, but its release is a notable step for open, natively multimodal AI. By moving beyond the common text-and-vision pairing, it gives developers a foundation for more integrated applications. The model weights are available on Hugging Face.

Sources

  • FlashLabs/Chroma-4B (Hugging Face)

Specs

Parameters: 4B
Context window: not listed
License: Apache-2.0
Downloads: 54

Modalities

Any-to-Any
