The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

DeepSeek-OCR-2 Tackles Multilingual Document AI

The new open vision-language model is designed to extract text and understand structure from complex, multilingual documents.

Jan 27, 2026
DeepSeek · Vision-Language

DeepSeek has released DeepSeek-OCR-2, a vision-language model specialized for optical character recognition (OCR). The model is designed to go beyond plain text extraction and capture document structure and content across multiple languages.

Unlike traditional OCR tools that follow a rigid pipeline, DeepSeek-OCR-2 operates as a vision-language model. It processes an image of a document and a user's prompt to generate structured output, allowing it to handle complex layouts, tables, and mixed-language text found in real-world documents like invoices, forms, and academic papers.
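The image-plus-prompt interface described above can be sketched in a few lines of Python. This is an illustration only, not DeepSeek's API: `DocumentRequest`, `fake_model`, `parse_markdown_table`, and `extract_table` are hypothetical names, the model is replaced by a stub that returns canned text, and the choice of Markdown as the structured output format is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DocumentRequest:
    """Hypothetical request: one page image plus a natural-language instruction."""
    image_bytes: bytes
    prompt: str


# A "model" here is any callable that maps a request to text output.
ModelFn = Callable[[DocumentRequest], str]


def fake_model(request: DocumentRequest) -> str:
    """Stub standing in for a real vision-language model; returns canned Markdown."""
    return (
        "# Invoice 0001\n"
        "\n"
        "| Item | Qty | Price |\n"
        "| --- | --- | --- |\n"
        "| Widget | 2 | 9.50 |\n"
        "| Gadget | 1 | 20.00 |\n"
    )


def parse_markdown_table(text: str) -> List[Dict[str, str]]:
    """Parse the pipe-delimited table lines in `text` (assumes a single table)."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the --- separator line
    return [dict(zip(header, row)) for row in body]


def extract_table(model: ModelFn, image_bytes: bytes, prompt: str) -> List[Dict[str, str]]:
    """Send image + prompt to the model, then parse its structured output."""
    output = model(DocumentRequest(image_bytes=image_bytes, prompt=prompt))
    return parse_markdown_table(output)


# Placeholder bytes; a real call would pass the contents of a scanned page.
table = extract_table(fake_model, b"<image bytes>", "Convert this invoice to Markdown.")
```

Because the model is passed in as a plain callable, the stub could be swapped for a real inference call without changing the parsing step. In a traditional pipeline, layout detection and text recognition would typically be separate stages, each with its own output format.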

A New Open Alternative

The release of DeepSeek-OCR-2 on Hugging Face gives developers an openly available alternative to proprietary document intelligence APIs from major cloud providers. Its key capabilities include:

  • Multilingual Support: Handles a wide range of languages within the same document.
  • Layout Understanding: Recognizes and preserves the structure of tables and multi-column text.
  • Versatility: Processes both scanned and born-digital documents.

The model is available under a custom license that permits commercial use, though it includes restrictions against using the model to create competing products. This move gives developers and businesses a new, powerful tool for building applications that require sophisticated document processing without relying on closed, pay-per-use services.

Sources

  • deepseek-ai/DeepSeek-OCR-2 (Hugging Face)

Specs

  • Parameters: not listed
  • Context window: not listed
  • License: Other
  • Downloads: 1.7M
  • Modalities: Vision-Language
