The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

Refreshed every 12 hours

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

Company

Baidu

6 models

Releases

Baidu/Text → Video

Baidu Releases NAVA for Text-to-Video with Audio

The new model from the Chinese tech giant uses a Multimodal Diffusion Transformer to generate synchronized audio and video from text or image prompts.

May 29, 2026
Text → Video
NAVA
Baidu/Text → Image

Baidu Releases 8B Text-to-Image Model ERNIE-Image

The large diffusion model from the Chinese tech giant is available under the commercially permissive Apache 2.0 license, a notable release for the community.

Apr 7, 2026
Text → Image
ERNIE-Image
Baidu/Vision-Language

Baidu Releases Qianfan-OCR for Document Intelligence

The new vision-language model from the Chinese tech giant is designed for complex, multilingual optical character recognition and layout analysis.

Mar 18, 2026
Vision-Language
Qianfan-OCR
Baidu/Vision-Language

Baidu Releases Open VLM for Advanced Document OCR

The new PaddleOCR-VL-1.5 model is built to parse not only text but also the tables, formulas, and page layouts found in complex documents.

Jan 28, 2026
Vision-Language
PaddleOCR-VL-1.5
Baidu/Image → Video

Baidu's Live-Avatar Animates Photos With Audio

The new 14-billion-parameter model uses audio input to generate realistic talking head videos from a single still image.

Dec 4, 2025
Image → Video
Live-Avatar
Baidu/Vision-Language

Baidu Releases PaddleOCR-VL for Document AI

The new vision-language model is fine-tuned to understand not just text, but the complex structure of tables, charts, and formulas.

Oct 16, 2025
Vision-Language
PaddleOCR-VL