Zhipu AI / Image → Video

Zhipu AI Releases SCAIL-2 for Character Animation

The new open-source diffusion model from the company's research arm generates video clips from a single character image and a sequence of poses.

Jun 9, 2026

Notable / MIT

Chinese AI firm Zhipu AI, through its research arm zai, has released SCAIL-2, an open-source diffusion model built for one specific and challenging task: character animation. Given a single static image of a character and a user-provided sequence of poses, the model generates a short video clip of that character performing the motion.

The model conditions its video generation on two inputs: a reference image of the character and a control video representing the desired motion, typically a skeletal pose sequence obtained through pose estimation. This gives creators fine-grained control over the final animation, letting them specify the character's movements precisely rather than relying on a text prompt alone.
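To make the two-input setup concrete, here is a minimal sketch in plain Python of how a reference image and a pose sequence might be bundled and turned into per-frame control images. Every name in it (`AnimationRequest`, `rasterize_pose`, `build_control_video`) is hypothetical and chosen for illustration only. It does not reflect SCAIL-2's actual interface, and it assumes keypoints are given as normalized (x, y) coordinates.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical types for illustration; not SCAIL-2's real API.
Keypoint = Tuple[float, float]  # assumed normalized (x, y), each in [0, 1]


@dataclass
class AnimationRequest:
    """Bundles the two conditioning inputs described in the article."""
    reference_image: List[List[int]]    # placeholder H x W pixel grid of the character
    pose_frames: List[List[Keypoint]]   # one skeleton (list of keypoints) per output frame

    def validate(self) -> None:
        """Check that every frame has the same keypoints and valid coordinates."""
        if not self.pose_frames:
            raise ValueError("pose sequence is empty")
        expected = len(self.pose_frames[0])
        for index, frame in enumerate(self.pose_frames):
            if len(frame) != expected:
                raise ValueError(f"frame {index} has {len(frame)} keypoints, expected {expected}")
            for x, y in frame:
                if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
                    raise ValueError(f"frame {index} has an out-of-range keypoint")


def rasterize_pose(frame: List[Keypoint], height: int, width: int) -> List[List[int]]:
    """Mark each keypoint as a 1 on an otherwise blank height x width control frame."""
    canvas = [[0] * width for _ in range(height)]
    for x, y in frame:
        col = min(int(x * width), width - 1)    # clamp x == 1.0 to the last column
        row = min(int(y * height), height - 1)  # clamp y == 1.0 to the last row
        canvas[row][col] = 1
    return canvas


def build_control_video(request: AnimationRequest, height: int, width: int) -> List[List[List[int]]]:
    """Turn the pose sequence into a list of control frames, one per pose."""
    request.validate()
    return [rasterize_pose(frame, height, width) for frame in request.pose_frames]
```

In a real pipeline, the control frames would be rendered skeleton images rather than single marked pixels, and they would be passed to the diffusion model together with the reference image. The sketch only shows the shape of the data: one character image plus one pose per output frame.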

Why It Matters

While many recent open video models focus on general-purpose text-to-video generation, SCAIL-2 provides a specialized tool for animators, game developers, and creative technologists. By focusing on pose-driven control, it opens up new workflows for creating character-centric content with a high degree of consistency and directorial input.

SCAIL-2 is released under the permissive MIT license, which allows commercial use and modification, so developers can integrate it into new applications and build on the core technology. The model and usage instructions are available on the official zai organization page on Hugging Face.

Sources

zai-org/SCAIL-2 (Hugging Face)

