The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Lumina-DiMOO: A Diffusion Model for Any-to-Any AI

This new open-source model generates and understands a mix of media types using diffusion-style decoding rather than the usual autoregressive, token-by-token generation.

Sep 9, 2025
Notable · Apache 2.0
Alpha-VLLM · Any-to-Any

The research group Alpha-VLLM has released Lumina-DiMOO, a multimodal model that takes a different architectural approach to the increasingly common "any-to-any" AI systems. Published under the permissive Apache 2.0 license, the model is designed to both understand and generate content across different data types.

A Diffusion-Based Approach

Unlike most popular large language models, which generate output autoregressively, one token at a time, Lumina-DiMOO is built as a diffusion-based LLM. Diffusion, a technique commonly associated with leading text-to-image generators, creates outputs by progressively refining a noisy starting point into a coherent result over several steps. Applying it to general multimodal tasks is a notable research path beyond autoregressive models.
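
The progressive refinement described above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not Lumina-DiMOO's actual sampler: it assumes a masked-token form of diffusion, where generation starts from a fully masked sequence and reveals the most confident positions at each step. The names `MASK`, `toy_predict`, and `iterative_unmask` are hypothetical, and the "model" simply reads from a known target with random confidences.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token for a not-yet-generated position


def toy_predict(sequence, target, rng):
    """Hypothetical stand-in for a trained denoiser.

    For every masked position it returns a (token, confidence) guess.
    A real model would compute these from learned weights; here the
    token is read from `target` and the confidence is random.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(sequence)
        if tok == MASK
    }


def iterative_unmask(target, steps=4, seed=0):
    """Start from an all-mask sequence and reveal it over several steps."""
    rng = random.Random(seed)
    sequence = [MASK] * len(target)
    history = [list(sequence)]
    for step in range(steps):
        guesses = toy_predict(sequence, target, rng)
        if not guesses:
            break  # nothing left to reveal
        # Reveal an equal share of the remaining masks each step,
        # so that the final step clears whatever is left.
        remaining_steps = steps - step
        quota = -(-len(guesses) // remaining_steps)  # ceiling division
        most_confident = sorted(
            guesses, key=lambda i: guesses[i][1], reverse=True
        )[:quota]
        for i in most_confident:
            sequence[i] = guesses[i][0]
        history.append(list(sequence))
    return sequence, history


if __name__ == "__main__":
    final, history = iterative_unmask("a red fox on snow".split(), steps=3)
    for state in history:
        print(" ".join(state))
```

The point of the sketch is the contrast with autoregressive decoding: positions are filled in parallel and in no fixed left-to-right order, and the number of refinement steps is a setting chosen by the caller rather than being tied to the output length.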

The "any-to-any" label implies that the model can accept and produce various combinations of modalities. That could enable applications such as generating images from detailed text, answering questions about an image, and other cross-modal tasks, which makes the model a potential foundation for more integrated multimodal systems.

By exploring an alternative to the dominant autoregressive systems, Lumina-DiMOO gives the open-source community a new framework for building multimodal AI. The model and its components are available on Hugging Face for researchers and developers to explore.

Sources

  • Alpha-VLLM/Lumina-DiMOO (Hugging Face)


Specs

  • Parameters: not listed
  • Context window: not listed
  • License: Apache 2.0
  • Downloads: 1.3K

Modalities

  • Any-to-Any
  • Text → Image
