Qwen releases flagship 397B multimodal MoE
The new open-source model from Alibaba uses a Mixture-of-Experts architecture to balance massive scale with efficient inference.
The Qwen team at Alibaba has released Qwen3.5-397B-A17B, an open-source model that pairs very large scale with efficient inference. According to its Hugging Face repository, the model has 397 billion total parameters, making it one of the largest open models available.
A Sparse Architecture at Scale
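What makes this scale manageable is its Mixture-of-Experts (MoE) architecture. Instead of activating all 397 billion parameters for every input, the model routes each token through a small subset of its experts, using only 17 billion active parameters at inference time, roughly 4% of the total. This "sparse" approach gives the model the knowledge capacity of a very large network while keeping per-token compute closer to that of a much smaller dense model. As with MoE models in general, the full set of weights still has to be stored and loaded, so the savings apply to compute rather than memory.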
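The routing idea can be illustrated with a toy sketch. This is not Qwen's actual router: the expert count, the value of k, the stand-in "experts" and every function name below are hypothetical, chosen only to show how top-k gating leaves most experts untouched for a given token. The only figures taken from the article are the 397 billion total and 17 billion active parameters.

```python
import math

TOTAL_PARAMS_B = 397   # total parameters in billions (from the model name)
ACTIVE_PARAMS_B = 17   # active parameters per token in billions (from the model name)


def active_fraction(total_b, active_b):
    """Share of the parameters that take part in processing one token."""
    return active_b / total_b


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and normalise their weights.

    Returns a list of (expert_index, weight) pairs. This is a generic
    top-k gate, assumed here for illustration only.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, router_logits, k):
    """Weighted sum of the selected experts; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Four hypothetical "experts", each just scaling its input.
toy_experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical router scores for one token: experts 1 and 3 score highest.
toy_logits = [0.1, 2.0, -1.0, 2.0]

print(f"active share: {active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B):.1%}")
print("selected experts:", sorted(i for i, _ in route_top_k(toy_logits, k=2)))
print("layer output:", moe_layer(1.0, toy_experts, toy_logits, k=2))
```

In this toy setup only two of the four experts run for the token, which is the same principle that lets a 397-billion-parameter model do roughly 17 billion parameters' worth of work per token.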
Beyond its scale, Qwen3.5 is also a capable vision-language model (VLM). This multimodal capability means it can process and understand both text and images, enabling more complex applications in areas like image captioning, visual question answering, and content analysis.
Released under the permissive Apache 2.0 license, Qwen3.5-397B-A17B represents a significant contribution to the open-source AI ecosystem. By providing access to a flagship-class MoE model, Alibaba is enabling developers and researchers to build on top of state-of-the-art multimodal AI without the need to train such a massive model from scratch.
Sources
- Qwen/Qwen3.5-397B-A17B (Hugging Face)