Qwen Releases Flagship 122B Multimodal MoE Model
The new Qwen3.5-122B-A10B combines a massive parameter count with an efficient Mixture-of-Experts architecture for advanced vision and language tasks.
Alibaba's Qwen team has unveiled Qwen3.5-122B-A10B, the flagship model in its new Qwen3.5 series. This release marks a significant entry into the high-end open-source model space, targeting complex multimodal tasks with a powerful and efficient architecture.
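The model's defining feature is how it pairs scale with sparsity. It is a 122-billion-parameter Mixture-of-Experts (MoE) model, but it activates only about 10 billion of those parameters for any given token (the "A10B" in its name). A router selects a small subset of expert sub-networks for each token, so the model keeps the knowledge capacity of a very large network while its per-token inference compute stays closer to that of a 10-billion-parameter model. All 122 billion parameters must still be stored and loaded, so the savings are in compute rather than memory, but this still makes the model cheaper to run than a dense model of similar size.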
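To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is an illustration only, not Qwen's implementation: the expert count (8), the top-k value (2), the toy "experts" that simply scale their input, and the function names `route_top_k` and `moe_forward` are all assumptions made for this example. In a real model the router is learned and runs per token in each MoE layer.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k):
    """Run only the k selected experts and mix their outputs by router weight."""
    chosen = route_top_k(router_logits, k)
    output = sum(w * experts[i](x) for i, w in chosen)
    return output, [i for i, _ in chosen]


# Hypothetical toy configuration (NOT the model's real settings).
NUM_EXPERTS = 8
TOP_K = 2

# Each toy "expert" just multiplies its input by a different constant.
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Stand-in for the scores a learned router would produce for one token.
rng = random.Random(0)
router_logits = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

output, used = moe_forward(1.0, experts, router_logits, TOP_K)
print(f"experts used: {used} of {NUM_EXPERTS}")

# Figures from the article: 122B total parameters, 10B active.
TOTAL_PARAMS = 122e9
ACTIVE_PARAMS = 10e9
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active fraction: {active_fraction:.1%}")  # 8.2%
```

The last two lines use only the figures quoted above: roughly 8% of the parameters take part in processing any one token, which is where the compute savings over a dense 122B model come from.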
A Vision-Language Specialist
As a vision-language model (VLM), Qwen3.5-122B-A10B is designed to process images and text together in a single input. This makes it suitable for applications such as visual question answering, image captioning, and object recognition within a broader textual context.
In keeping with the project's commitment to open-source development, the model is available under the permissive Apache 2.0 license, which allows for both academic and commercial use. Developers can access the model weights and explore its capabilities on the Hugging Face Hub.
Sources
- Qwen/Qwen3.5-122B-A10B (Hugging Face)