Alibaba's Live-Avatar Animates Photos With Audio

The new 14-billion-parameter model uses audio input to generate realistic talking-head videos from a single still image.

Dec 4, 2025

Update · Apache 2.0

Alibaba's Quark Vision team has released Live-Avatar, an open-source model that animates a still photograph into a talking-head video driven by a separate audio track. The project aims to create realistic, audio-driven digital avatars from a single source image, a common but challenging task in generative AI.

The 14-billion-parameter model is built on Wan2.2-S2V-14B, a foundation model for speech-to-video (audio-driven video) generation. Live-Avatar is fine-tuned to synchronize lip movements and generate natural head motion that follows the cadence and content of the input audio.

Audio-driven avatar technology has been explored extensively in commercial applications, but capable open-source options are less common. Releasing Live-Avatar under the permissive Apache 2.0 license gives researchers and developers a strong baseline for creating virtual assistants, enhancing accessibility tools, or powering new forms of digital content creation.

The model, code, and usage instructions are now available for developers to explore on the Hugging Face Hub. The repository includes examples demonstrating the model's ability to generate coherent and expressive video from a variety of portrait images.
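For readers who want to pull the files locally, the following is a minimal sketch, assuming the `huggingface_hub` Python package is installed and that the repository name matches the one listed under Sources. The helper names `repo_url` and `fetch_live_avatar` and the default target directory are illustrative, not part of the project. The sketch only downloads files; running inference should follow the repository's own instructions.

```python
from pathlib import Path

# Repository name as listed under Sources in this article.
REPO_ID = "Quark-Vision/Live-Avatar"


def repo_url(repo_id: str) -> str:
    """Return the Hugging Face Hub page for a model repository."""
    return f"https://huggingface.co/{repo_id}"


def fetch_live_avatar(target_dir: str = "checkpoints/Live-Avatar") -> Path:
    """Download every file in the repository into target_dir (hypothetical path).

    Needs network access and `pip install huggingface_hub`.
    """
    # Imported lazily so this module still loads when the package is absent.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=REPO_ID, local_dir=target_dir)
    return Path(local_path)
```

Calling `fetch_live_avatar()` returns the local directory once the download finishes, while `repo_url(REPO_ID)` gives the model page to open in a browser.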

Sources

Quark-Vision/Live-Avatar (Hugging Face)

