Qwen Unveils Wan2.2, a 14B Open Text-to-Video Model
The new Apache 2.0-licensed model from Alibaba's team uses a Mixture-of-Experts architecture for efficient, high-quality video generation.

The Qwen team at Alibaba has introduced a significant new open-source model for generating video from text prompts, called Wan2.2-T2V-A14B. This release expands the team's portfolio of powerful, openly accessible foundation models.
What sets this model apart is its use of a Mixture-of-Experts (MoE) architecture. It has 14 billion active parameters, meaning that only part of the model (roughly 27 billion parameters in total) is engaged at any given inference step. This design aims to deliver the quality of a larger model while keeping the computational cost of inference close to that of a 14-billion-parameter one.
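
To make the distinction between total and active parameters concrete, here is a minimal sketch of expert routing in plain Python. It is a toy illustration and not Wan2.2's implementation: the `Expert` and `ToyMoE` classes, the threshold-based router, and the parameter counts are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Expert:
    """Stand-in for one expert sub-network (hypothetical, for illustration)."""
    name: str
    num_params: int
    fn: Callable[[float], float]


class ToyMoE:
    """Routes each input to exactly one expert, so only that expert's
    parameters are 'active' for the call."""

    def __init__(self, experts: List[Expert], router: Callable[[float], int]):
        self.experts = experts
        self.router = router

    @property
    def total_params(self) -> int:
        # Every expert must be stored, so memory scales with the total.
        return sum(e.num_params for e in self.experts)

    def forward(self, x: float) -> Tuple[float, int, str]:
        # Compute scales with the selected expert only.
        expert = self.experts[self.router(x)]
        return expert.fn(x), expert.num_params, expert.name


# Made-up sizes: two equally sized experts.
experts = [
    Expert("expert_a", 1_000, lambda x: x * 2.0),
    Expert("expert_b", 1_000, lambda x: x + 1.0),
]
# Hypothetical routing rule: a simple threshold on the input.
moe = ToyMoE(experts, router=lambda x: 0 if x >= 0.5 else 1)

out, active, name = moe.forward(1.0)
print(f"{name}: output={out}, active={active:,} of {moe.total_params:,} params")
```

In this sketch every call touches half of the stored parameters: the full set still has to fit in memory, but the work done per call is that of a single expert.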
The release of Wan2.2 is a notable event in the competitive landscape of generative video. By making the model available under the permissive Apache 2.0 license, the Qwen team gives researchers and developers a model they can use, modify, and deploy commercially with few restrictions, whether to build new applications or to advance research in video synthesis.
The model is now available for download and experimentation on the Hugging Face Hub, allowing the community to begin exploring its capabilities immediately.
Sources
- Wan-AI/Wan2.2-T2V-A14B (Hugging Face)