The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Ming-Lite-Omni 1.5 Brings Any-to-Any Modality to Open Source

The new MIT-licensed model from inclusionAI can process and generate a mix of text, images, audio, and video, pushing the boundaries of open multimodal AI.

Jul 15, 2025
Notable · MIT
inclusionAI · Any-to-Any

inclusionAI has released Ming-Lite-Omni 1.5, a new open-source model designed to handle several data types within a single system. Published under the permissive MIT license, the model aims to provide "any-to-any" omni-modal capabilities, meaning it can both accept and produce more than one kind of media. The model and its components are available now on Hugging Face.

Many multimodal models operate on a fixed input-to-output path, such as text-to-image. An omni-modal system is instead designed to process and generate content across several formats in any combination. Ming-Lite-Omni 1.5 can reportedly understand and create content using text, images, audio, and video, which allows for more complex and integrated AI applications.
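The distinction between a fixed path and an any-to-any system can be sketched as a small capability check. This is a conceptual illustration only, not the model's API: the class names `FixedPathModel` and `AnyToAnyModel` are hypothetical, and the four modalities are simply those the article reports for Ming-Lite-Omni 1.5.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    # The four formats named in the article.
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"
    VIDEO = "video"


@dataclass(frozen=True)
class FixedPathModel:
    """Hypothetical model with exactly one input type and one output type."""
    source: Modality
    target: Modality

    def supports(self, inputs: frozenset, outputs: frozenset) -> bool:
        # Only the single hard-wired path is accepted.
        return inputs == frozenset({self.source}) and outputs == frozenset({self.target})


@dataclass(frozen=True)
class AnyToAnyModel:
    """Hypothetical model accepting any non-empty mix of its modalities."""
    modalities: frozenset

    def supports(self, inputs: frozenset, outputs: frozenset) -> bool:
        # Any non-empty subset is valid on either side of the request.
        return (
            bool(inputs)
            and bool(outputs)
            and inputs <= self.modalities
            and outputs <= self.modalities
        )


if __name__ == "__main__":
    text_to_image = FixedPathModel(Modality.TEXT, Modality.IMAGE)
    omni = AnyToAnyModel(frozenset(Modality))

    # A video plus its audio track in, a text summary out.
    request_in = frozenset({Modality.VIDEO, Modality.AUDIO})
    request_out = frozenset({Modality.TEXT})

    print(text_to_image.supports(request_in, request_out))
    print(omni.supports(request_in, request_out))
```

The fixed-path check rejects anything other than its single route, while the any-to-any check only asks that the requested inputs and outputs fall within the supported set.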

A Flexible Foundation for Multimodal AI

The release matters mainly because it pairs an omni-modal architecture with a permissive license. Developers and researchers can now experiment with multimodal tasks that have largely been the domain of closed, proprietary systems. Potential applications could include:

  • Generating a video with a descriptive soundtrack from a single text prompt.
  • Creating a detailed textual summary of an audio-visual recording.
  • Answering questions about a video by analyzing both its frames and its spoken audio.

While specific benchmarks have not been released, the "Lite" designation in its name suggests that Ming-Lite-Omni 1.5 may be a more computationally accessible version of this technology. Its release gives developers a new open tool for building AI systems that work across text, images, audio, and video.
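For readers who want to fetch the release from Hugging Face, a minimal sketch follows. It assumes the `huggingface_hub` package is installed and uses the repository name listed under Sources; the helper names and the default target directory are hypothetical, and nothing here reflects inclusionAI's own loading instructions.

```python
from pathlib import Path

# Repository name as listed under Sources.
REPO_ID = "inclusionAI/Ming-Lite-Omni-1.5"


def model_page_url(repo_id: str = REPO_ID) -> str:
    """Return the Hugging Face model page for a repository."""
    return f"https://huggingface.co/{repo_id}"


def download_weights(repo_id: str = REPO_ID,
                     target_dir: str = "ming-lite-omni-1.5") -> Path:
    """Download every file in the repository into target_dir.

    The import is deferred so the URL helper works without the package.
    """
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=repo_id, local_dir=target_dir)
    return Path(local_path)


if __name__ == "__main__":
    print(model_page_url())
    # Uncomment to fetch the full repository; the download may be large.
    # print(download_weights())
```

Downloading the files does not by itself run the model; the model card on Hugging Face is the place to check for the required inference code and hardware.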

Sources

  • inclusionAI/Ming-Lite-Omni-1.5 (Hugging Face)


Specs

  • Parameters: not listed
  • Context window: not listed
  • License: MIT
  • Downloads: 213
  • Modalities: Any-to-Any
