LightOn Releases LightOnOCR-2, a 1B Document AI Model
The new vision model from the Paris-based AI lab initializes its decoder from Mistral-7B weights and extracts text and structure from complex documents such as PDFs and forms.

Parisian AI company LightOn has released LightOnOCR-2, a new 1-billion-parameter vision language model specialized in document understanding. The model is designed to perform optical character recognition (OCR) on complex documents, extracting not just text but also structural information.
Unlike simple OCR tools, LightOnOCR-2 is built to parse challenging layouts like tables, forms, and multi-column PDFs. This makes it suitable for enterprise automation tasks such as processing invoices or digitizing records, a domain often dominated by proprietary, API-gated services.
A Mistral-based Architecture
The model uses a Transformer-based encoder-decoder architecture. Its decoder was initialized from a subset of the weights of Mistral-7B-v0.1, so it inherits some of that open model's language capabilities while staying at a much smaller 1-billion-parameter footprint.
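As a rough illustration of what initializing a decoder from a subset of a larger model's weights can look like, here is a minimal sketch in plain Python. It is not LightOn's procedure: the article does not say which weights were kept, so the evenly spaced layer selection, the `layers.<index>.<name>` key convention, and the helper names `evenly_spaced_layers` and `init_from_subset` are all assumptions made for this example, and strings stand in for real tensors.

```python
import re
from typing import Any, Dict, List

# Hypothetical key convention for per-layer weights: "layers.<index>.<name>".
LAYER_KEY = re.compile(r"^layers\.(\d+)\.(.+)$")


def evenly_spaced_layers(num_src: int, num_dst: int) -> List[int]:
    """Pick num_dst source-layer indices spread across num_src layers.

    Assumption: even spacing is only one possible selection rule.
    """
    if not 0 < num_dst <= num_src:
        raise ValueError("need 0 < num_dst <= num_src")
    if num_dst == 1:
        return [0]
    return [i * (num_src - 1) // (num_dst - 1) for i in range(num_dst)]


def init_from_subset(src_state: Dict[str, Any], keep: List[int]) -> Dict[str, Any]:
    """Build a smaller state dict that keeps only the layers listed in keep.

    Kept layers are renumbered 0..len(keep)-1; entries that are not
    per-layer weights (for example embeddings) are copied unchanged.
    """
    remap = {src: dst for dst, src in enumerate(keep)}
    out: Dict[str, Any] = {}
    for key, value in src_state.items():
        match = LAYER_KEY.match(key)
        if match is None:
            out[key] = value
            continue
        src_index = int(match.group(1))
        if src_index in remap:
            out[f"layers.{remap[src_index]}.{match.group(2)}"] = value
    return out


# Toy source "model" with 8 layers; the values are placeholders for tensors.
toy_source = {"embed.weight": "E"}
for i in range(8):
    toy_source[f"layers.{i}.attn.weight"] = f"A{i}"
    toy_source[f"layers.{i}.mlp.weight"] = f"M{i}"

toy_keep = evenly_spaced_layers(num_src=8, num_dst=3)
toy_small = init_from_subset(toy_source, toy_keep)
```

In this toy run the smaller state dict keeps three of the eight layers plus the embedding. A real initialization would operate on actual tensors and would also have to reconcile any differences in hidden size or vocabulary, which this sketch ignores.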
The complete model weights and code are available on the Hugging Face Hub for developers to download and use. It's released under a custom LightOnAI-OpenRAIL-M license, which permits commercial use but includes some use-case restrictions common to Responsible AI licenses.
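For readers who want to pull the files locally, one possible approach is the `snapshot_download` function from the `huggingface_hub` package, sketched below. The repository ID is the one listed under Sources. The `fetch_model` helper, its `downloader` parameter, and the default directory name are hypothetical and exist only for this example. The sketch also assumes the repository can be downloaded without extra authentication, which the article does not confirm.

```python
from pathlib import Path
from typing import Callable, Optional

# Repository name as listed under Sources.
REPO_ID = "lightonai/LightOnOCR-2-1B"


def fetch_model(
    local_dir: str = "lightonocr2",
    downloader: Optional[Callable[..., str]] = None,
) -> Path:
    """Download the model repository and return the local folder.

    downloader is injectable so the function can be exercised without
    network access; by default it uses huggingface_hub.snapshot_download.
    """
    if downloader is None:
        # Imported lazily so this module loads even without the package.
        from huggingface_hub import snapshot_download

        downloader = snapshot_download
    return Path(downloader(repo_id=REPO_ID, local_dir=local_dir))


if __name__ == "__main__":
    # Requires `pip install huggingface_hub` and network access.
    print(fetch_model())
```

How the downloaded weights are then loaded for inference depends on the code published in the repository, so check the model card and the license terms before using it.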
Sources
- lightonai/LightOnOCR-2-1B (Hugging Face)