The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Zhipu AI Releases Multilingual GLM-OCR Vision Model

The new vision-language model from the creators of the GLM series is specialized for recognizing and extracting text from images across multiple languages.

Jan 30, 2026
Zhipu AI · Vision-Language · GLM-OCR

Zhipu AI, the company behind the prominent GLM series of large language models, has released a new open-source model focused on a classic computer vision task: optical character recognition (OCR). The new model, called GLM-OCR, is a vision-language model (VLM) designed specifically to identify and extract text embedded in images.

The key feature of GLM-OCR is its multilingual capability. According to the project's official release page, the model is trained to handle text in Chinese, English, Korean, and Japanese, making it a potentially valuable tool for applications that need to process documents and images from across East Asia and the English-speaking world. You can find the model and usage instructions on its Hugging Face repository.
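For readers who want to pull the weights locally, a minimal sketch follows. It assumes the `huggingface_hub` Python package is installed and that the `zai-org/GLM-OCR` repository listed under Sources can be downloaded without extra authentication. The helper names `repo_url` and `download_weights` are hypothetical, chosen only for illustration. Inference is deliberately left out, since the usage instructions live on the repository itself.

```python
from typing import Optional

# Repository ID as listed in this article's Sources section.
REPO_ID = "zai-org/GLM-OCR"


def repo_url(repo_id: str) -> str:
    """Build the Hugging Face model page URL for a repository ID."""
    return f"https://huggingface.co/{repo_id}"


def download_weights(repo_id: str = REPO_ID, local_dir: Optional[str] = None) -> str:
    """Download a snapshot of the repository and return its local path.

    Assumption: the `huggingface_hub` package is installed. It is imported
    here, inside the function, so the rest of the module still loads
    when the package is missing.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    # Print the model page so the license can be read before downloading.
    print(repo_url(REPO_ID))
```

Calling `download_weights()` fetches the full repository into the local Hugging Face cache, so it is worth reading the license on the model page first.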

Why it matters

High-quality OCR is a foundational technology for digitizing documents, parsing user interfaces, and powering accessibility tools. While powerful OCR services are available through proprietary APIs, strong open-source alternatives empower developers to build applications with more privacy and control. GLM-OCR provides a new, specialized tool for this purpose, particularly for developers working with multilingual content.

Zhipu AI has released the model weights, but prospective users should read the license first. The model ships under a custom license that limits its use in online services, a key distinction from permissive licenses such as Apache 2.0. Those limits rule out some commercial applications, so developers should review the terms carefully before integrating the model into their projects.

Sources

  • zai-org/GLM-OCR (Hugging Face)


Specs

  • Parameters: not listed
  • Context window: not listed
  • License: Other
  • Downloads: 2.6M
  • Modalities: Vision-Language
