The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


OpenBMB Releases 'Any-to-Any' Multimodal Model

The new MiniCPM-o 4.5 model from the open-source research group can process and generate interleaved combinations of images, text, and audio.

Feb 3, 2026
OpenBMB · Any-to-Any

The open-source AI community OpenBMB has released MiniCPM-o 4.5, a multimodal model built for "any-to-any" communication. Where many models take one type of input and produce one type of output, MiniCPM-o is designed to handle a mix of text, images, and audio in a single conversational flow.

The goal is more natural, fluid interaction with AI. The model is described as "full-duplex," which suggests it can understand interleaved inputs: for example, a user could provide an image, ask a question in text, and follow up with a spoken clarification. In response, the model could generate its own combination of text, a new image, and synthesized speech.
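
To make the idea of interleaved input concrete, here is a minimal Python sketch of how one mixed-modality turn might be represented. The `Part` and `Turn` types and the file names are hypothetical, invented for illustration; they are not the MiniCPM-o API, and the release record does not describe the model's actual input format.

```python
from dataclasses import dataclass, field
from typing import List, Literal

Modality = Literal["text", "image", "audio"]


@dataclass
class Part:
    """One piece of a turn: a modality tag plus its payload (text or a file path)."""
    modality: Modality
    content: str


@dataclass
class Turn:
    """A single conversational turn made of ordered, mixed-modality parts."""
    role: str
    parts: List[Part] = field(default_factory=list)

    def modalities(self) -> List[str]:
        # Preserve order so the interleaving stays visible.
        return [p.modality for p in self.parts]


# Hypothetical user turn mirroring the example in the text:
# an image, a typed question, then a spoken clarification.
user_turn = Turn(
    role="user",
    parts=[
        Part("image", "photo.jpg"),
        Part("text", "What is shown here?"),
        Part("audio", "clarification.wav"),
    ],
)

print(user_turn.modalities())
```

Keeping the parts in an ordered list, rather than in one field per modality, is what lets a structure like this express interleaving: the order in which the image, text, and audio arrive is part of the message.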

Why It Matters

This release represents a move beyond simple, turn-based tasks like image captioning. It points toward AI systems that can participate in dynamic, multi-format conversations. By handling various data streams simultaneously, MiniCPM-o could power more sophisticated applications in areas like:

  • Interactive educational tools
  • Advanced accessibility software
  • Complex creative and design assistants

While technical details like parameter count were not specified in the release record, the model's architecture itself is the key development. Researchers can explore its capabilities directly, as it is available on Hugging Face. The model provides an open-source foundation for building the next generation of conversational AI agents.

Sources

  • openbmb/MiniCPM-o-4_5 (Hugging Face)


Specs

  • Parameters: not listed
  • Context window: not listed
  • License: Other
  • Downloads: 19.4K
  • Modalities: Any-to-Any, Vision-Language
  • Versions: 2
