The Open Weights
© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Baidu Releases Qianfan-OCR for Document Intelligence

The new vision-language model from the Chinese tech giant is designed for complex, multilingual optical character recognition and layout analysis.

Mar 18, 2026
Baidu · Vision-Language

Chinese technology company Baidu has released Qianfan-OCR, a new vision-language model specialized for optical character recognition and document understanding. The model is aimed at developers who need to extract text and structural information from complex documents across multiple languages.

As a document intelligence model, Qianfan-OCR is designed to go beyond simple text transcription. Its capabilities include recognizing tables, analyzing page layouts, and handling a wide variety of languages. This makes it suitable for digitizing complex materials like invoices, structured forms, and academic papers that mix text with other elements.

A New Tool for Digitization

The release adds another option to the growing ecosystem of open models for document processing. Baidu's entry targets multilingual enterprise and archival applications, where documents often contain complex formatting. Extracting text and structure from such documents matters both to businesses automating data entry and to researchers digitizing large volumes of text.

The model weights and usage instructions are available on the Hugging Face Hub. Note that the model is released under a custom End User License Agreement, which may restrict some use cases that more permissive open-source licenses allow.

Sources

  • baidu/Qianfan-OCR (Hugging Face)


Specs

Parameters: not listed
Context window: not listed
License: Other
Downloads: 202.8K

Modalities

Vision-Language
