The Open Weights

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Tencent Releases 1B-Parameter HunyuanOCR Model

The new vision-language model from Tencent Hunyuan offers a compact, end-to-end solution for optical character recognition.

Nov 18, 2025
Tencent · Vision-Language

Tencent has released HunyuanOCR, a vision-language model specialized for reading text in images. At one billion parameters, it is compact by current standards and gives developers an efficient, openly available tool for optical character recognition (OCR) tasks.

HunyuanOCR uses an end-to-end architecture that collapses the traditional OCR pipeline. Instead of first detecting text boxes and then separately recognizing the characters inside them, the model handles the whole task in a single step. With no separate detection stage in front of recognition, a missed or misplaced text box cannot silently drop text, which can help on challenging inputs such as dense documents or text in natural scenes.
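The difference between the two designs can be sketched in a few lines of Python. This is a toy illustration only, not HunyuanOCR's implementation: the "image" is a list of strings standing in for page rows, and `toy_detect`, `toy_recognize`, and `toy_model` are hypothetical stand-ins invented here. It shows how a detection miss in a two-stage pipeline removes text before the recognizer ever sees it.

```python
from typing import Callable, List

# Toy stand-in for an image: each string is one "row" of the page.
Image = List[str]


def cascaded_ocr(
    image: Image,
    detect: Callable[[Image], List[int]],
    recognize: Callable[[str], str],
) -> List[str]:
    """Two-stage pipeline: locate text regions, then read each one."""
    regions = detect(image)  # stage 1: where is the text?
    return [recognize(image[i]) for i in regions]  # stage 2: what does it say?


def end_to_end_ocr(image: Image, model: Callable[[Image], List[str]]) -> List[str]:
    """Single-stage design: one model maps the whole image to text."""
    return model(image)


# Hypothetical components, for illustration only.
def toy_detect(image: Image) -> List[int]:
    # Flags rows whose first character is a letter, so indented text is missed.
    return [i for i, row in enumerate(image) if row[:1].isalpha()]


def toy_recognize(row: str) -> str:
    return row.strip()


def toy_model(image: Image) -> List[str]:
    # Reads every non-blank row directly, with no separate detection step.
    return [row.strip() for row in image if row.strip()]


page = ["Invoice 42", "", "   Total: 10 EUR"]
cascaded_result = cascaded_ocr(page, toy_detect, toy_recognize)
end_to_end_result = end_to_end_ocr(page, toy_model)
print(cascaded_result)
print(end_to_end_result)
```

In this toy, the cascaded pipeline loses the indented "Total" row because the detector never proposes it. A real end-to-end model can still misread text; the sketch only shows where the two-stage design adds a failure point.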

The model suits a range of applications, including document digitization, information extraction from forms, and reading text in real-world photos such as street signs or product labels. The model assets are available on the Hugging Face Hub. The listing records the license as "Other" rather than a standard identifier such as Apache 2.0, so check the terms on the model card before research or commercial use.
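For readers who want the files locally, a minimal sketch follows. It assumes the `huggingface_hub` package is installed and uses the repository ID from the Sources list; the `models` folder and the `local_dir_for` helper are hypothetical choices made for this example, not part of any Tencent tooling.

```python
from pathlib import Path

# Repository named in the Sources list of this article.
REPO_ID = "tencent/HunyuanOCR"


def local_dir_for(repo_id: str, root: Path) -> Path:
    """Map 'org/name' to root/org__name so each repo gets its own folder.

    Hypothetical naming scheme chosen for this example.
    """
    return root / repo_id.replace("/", "__")


def download_weights(root: Path = Path("models")) -> Path:
    """Download every file in the repository and return the local folder."""
    # Imported lazily so this module loads even without the package installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(REPO_ID, root)
    snapshot_download(repo_id=REPO_ID, local_dir=target)
    return target
```

Calling `download_weights()` needs network access and enough disk space for the full repository. How to load and run the model afterwards depends on the instructions in the model card, which this sketch does not cover.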

This release from the Tencent Hunyuan team fits a broader industry trend toward smaller, specialized models. While massive general-purpose models attract headlines, focused tools like HunyuanOCR give developers a practical, efficient way to solve a specific, common problem.

Sources

  • tencent/HunyuanOCR (Hugging Face)


Specs

  • Parameters: 1B
  • Context window: not listed
  • License: Other
  • Downloads: 258.9K
  • Modalities: Vision-Language
