The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

OpenBMB Releases Compact Multimodal Model MiniCPM-V 4.5

The new vision-language model from the open-source research group demonstrates strong OCR and video understanding capabilities in a small package.

Aug 24, 2025
OpenBMB · Vision-Language

The open-source AI research group OpenBMB has released MiniCPM-V 4.5, a compact vision-language model (VLM). It aims to deliver sophisticated multimodal understanding without the massive computational resources often associated with leading-edge vision systems.

According to the release notes, the model performs well on tasks that have historically challenged even larger systems. Its key features include high-accuracy optical character recognition (OCR), the ability to follow context across multiple images, and video understanding, a significant step for a model in its size class.

Why it matters

A smaller yet capable VLM like MiniCPM-V 4.5 matters most to developers working with limited hardware. Its efficiency opens the door to on-device applications and more accessible multimodal AI research, lowering the barrier to entry for building vision-based tools.

The model is available now for download and experimentation on the Hugging Face Hub, in the openbmb/MiniCPM-V-4_5 repository. It is released under a custom license, so users should review the terms before deploying it in production.

Sources

  • openbmb/MiniCPM-V-4_5 (Hugging Face)

Specs

  • Parameters: not listed
  • Context window: not listed
  • License: Other
  • Downloads: 63.8K
  • Modalities: Vision-Language
