The Open Weights
© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.


Qwen3-Omni Arrives With Any-to-Any Multimodality

The new 30B Mixture-of-Experts model from Alibaba's Qwen team accepts text, image, audio, and video input and responds in text or speech.

Sep 20, 2025
Major release · Other
Qwen · Alibaba · Any-to-Any
Qwen3-Omni-30B-A3B-Instruct

Alibaba's Qwen team has released Qwen3-Omni, a new open-weight multimodal model family. The first release, a 30-billion-parameter instruction-tuned variant, is tagged for "any-to-any" tasks: it natively accepts text, image, audio, and video input and generates text and speech output.

This omni-modal capability sets it apart from typical open-source releases. While many models can interpret images and text, Qwen3-Omni can handle a wider range of tasks, including speech-to-text transcription, text-to-speech generation, and visual language understanding. This allows it to function as a more versatile and integrated assistant, capable of understanding a spoken query about an image and responding with a spoken answer.

Technical Details

The model uses a Mixture-of-Experts (MoE) architecture with 30 billion total parameters, of which only about 3 billion are active per token during inference (the "A3B" in the model name), which keeps compute cost well below that of a dense 30B model. According to its official model card, it is built to handle complex, interleaved inputs from different modalities.
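To illustrate why an MoE model touches only a fraction of its parameters per token, here is a minimal sketch of top-k expert routing in Python. The expert count, top-k value, and parameter sizes are hypothetical values chosen for readability; they are not Qwen3-Omni's actual configuration, and the function names are invented for this example.

```python
import math
import random


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_fraction(num_experts, k, expert_params, shared_params):
    """Fraction of all parameters used for one token in a toy MoE model."""
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


# Hypothetical toy configuration (assumption, not the real model's).
NUM_EXPERTS = 8
TOP_K = 2

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a learned gate
routes = top_k_route(logits, TOP_K)
frac = active_fraction(NUM_EXPERTS, TOP_K, expert_params=1_000_000, shared_params=2_000_000)

print("experts chosen for this token:", [i for i, _ in routes])
print(f"share of parameters active per token: {frac:.0%}")
```

In this toy setup each token runs through 2 of 8 experts plus the shared layers, so 40% of the parameters are active. A real model with many more experts per layer can push that share far lower, which is how a 30B-parameter model can run with only a few billion active parameters.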

Qwen3-Omni gives developers a single open-weight model for applications that combine speech, vision, and text. However, prospective users should note that the model's license is listed as "Other" rather than a standard identifier such as Apache 2.0 or MIT, so the license terms should be reviewed before commercial use.

Sources

  • Qwen/Qwen3-Omni-30B-A3B-Instruct (Hugging Face)

Specs

  • Parameters: 30B · MoE
  • Context window: not listed
  • License: Other
  • Downloads: 1.5M

Modalities

  • Any-to-Any
  • Vision-Language
  • Text → Speech
  • Speech → Text
