Qwen · Alibaba / Text → Video

Alibaba's Wan2.2 Adds Control to Open Video

The new 14-billion-parameter model from Alibaba's PAI team offers fine-grained control over video generation using inputs like sketches and depth maps.

Sep 10, 2025

Update · Apache 2.0

Alibaba's Platform for AI (PAI) team has released a new open-source video generation model, Wan2.2-VACE-Fun-A14B. Rather than being another plain text-to-video generator, the 14-billion-parameter model is built to give creators direct control over the final output.

The key differentiator for this Wan2.2 variant is its ability to condition video generation on more than just a text prompt. Users can provide structural inputs like depth maps, Canny edge outlines, sketches, and human pose skeletons to guide the creative process. This allows for much more precise control over scene composition and character movement than is possible with text alone.
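To make the idea of a structural input concrete, here is a minimal sketch of how an edge-outline control clip could be derived from source frames, one edge map per frame. It is an illustration only: it uses a simple gradient threshold as a stand-in for real Canny edge detection, and the names `edge_map` and `control_video` are hypothetical, not part of the Wan2.2 release or its preprocessing code.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of 0-255 pixel values


def edge_map(frame: Frame, threshold: int = 64) -> Frame:
    """Return a binary (0/255) edge map from central-difference gradients.

    Simplified stand-in for Canny: no smoothing, no non-maximum
    suppression, no hysteresis. Border pixels are left at 0.
    """
    height, width = len(frame), len(frame[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = frame[y][x + 1] - frame[y][x - 1]  # horizontal gradient
            gy = frame[y + 1][x] - frame[y - 1][x]  # vertical gradient
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 255
    return edges


def control_video(frames: List[Frame], threshold: int = 64) -> List[Frame]:
    """Build one edge map per input frame, aligned with the source clip."""
    return [edge_map(f, threshold) for f in frames]


# Hypothetical 6x6 test frame: dark left half, bright right half.
frame = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
control = control_video([frame, frame])
```

The resulting `control` list has the same length and resolution as the source clip, which is the general shape a control-conditioned model expects: the outline fixes where things are in each frame, and the text prompt describes what they should look like.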

Under the Hood

The "VACE" in the name refers to the "All-in-One Video Creation and Editing" framework, and the "Fun" suffix matches PAI's VideoX-Fun line of models. Together they mark this release as a Wan2.2 base model adapted for specialized control tasks. According to the team, adapting the base model this way makes it more efficient to train and tailor the model for specific creative needs.

Released under the permissive Apache 2.0 license, Wan2.2-VACE-Fun-A14B joins a growing field of open models that are making advanced video synthesis more accessible. By focusing on controllability, Alibaba is providing a valuable tool for developers and artists who need to move beyond simple prompts and direct their creative process with greater precision.

Sources

alibaba-pai/Wan2.2-VACE-Fun-A14B (Hugging Face)
