Latest open-source Image → Video models

MiniMax/Text → Video

MiniMax Releases H3 Video Model on Hugging Face

The company's new diffusion model handles text-to-video and image-to-video, with support for joint audio-video generation.

Jul 28, 2026

Image → Video Text → Video

Qwen · Alibaba/Image → Video

Wan-Dancer-14B turns still images into dance videos

Alibaba's Wan team releases an Apache-2.0 image-to-video model built for music-driven dance generation.

Jul 10, 2026

Image → Video

NVIDIA/Text → Video

NVIDIA's Cosmos 3 Edge Brings World Models Closer

A new edge-optimized variant of NVIDIA's Cosmos world-model line aims to run generative video where the compute lives.

Jul 1, 2026

Image → Video Text → Video

Open Gigaai/Image → Video

GigaAI Releases Giga-World-1 Under Apache 2.0

An open image-to-video world model aims to bridge physically grounded generation and robot policy learning.

Jul 1, 2026

Image → Video

Zhipu AI/Image → Video

Zhipu AI Releases SCAIL-2 for Character Animation

The new open-source diffusion model from the company's research arm generates video clips from a single character image and a sequence of poses.

Jun 9, 2026

Image → Video

NVIDIA/Image → Video

NVIDIA Releases Cosmos3 Image-to-Video World Model

The latest release in NVIDIA's 'world model' research family aims to generate coherent and realistic video from a single static image.

May 21, 2026

Image → Video

Meituan/Image → Video

Meituan releases LongCat-Video-Avatar 1.5

An audio-driven avatar model that animates still images into talking video, with support for continuation of longer clips.

May 21, 2026

Image → Video

NVIDIA/Image → Video

NVIDIA Releases SANA-WM, a Camera-Controllable Video Model

The new model, SANA-WM, uses a bidirectional diffusion process to give creators fine-grained control over camera movement and video editing.

May 18, 2026

Image → Video Text → Video

Lightricks/Image → Video

Lightricks Releases LoRA for AI Lip-Dubbing

The new 'Identity-Control' adapter fine-tunes the company's LTX-2.3 video model to create realistic lip-syncing for dubbing workflows.

May 11, 2026

Image → Video Text → Video

Motif Technologies/Text → Video

Motif Releases 2B Open-Source Text-to-Video Model

The Apache 2.0-licensed model uses a diffusion transformer architecture and offers an open alternative for video generation research.

Apr 14, 2026

Image → Video Text → Video

Tencent/Image → Video

Tencent Releases HY-OmniWeaving for Multi-Image Video

Built on their HunyuanVideo-1.5 architecture, the new model synthesizes video by combining multiple static images and text prompts into a cohesive narrative.

Mar 31, 2026

Image → Video Text → Video

GAIR/Image → Video

GAIR Releases daVinci-MagiHuman for Video Generation

The new open-source model from the General Artificial Intelligence Research team can create video clips complete with audio from a variety of inputs.

Mar 21, 2026

Image → Video Any-to-Any

Lightricks/Image → Video

Lightricks LTX-2.3 Generates Video and Audio Together

The new model can create video and a corresponding soundtrack simultaneously from text, image, or audio prompts.

Mar 4, 2026

Image → Video Text → Video

OpenMOSS/Any-to-Any

OpenMOSS Releases MOVA, a 720p Multimodal Video Generator

The new open model can generate high-definition video with synchronized audio from a flexible combination of text and image prompts.

Jan 28, 2026

Image → Video Any-to-Any

OpenMOSS/Image → Video

OpenMOSS Releases MOVA for Joint Video and Audio Gen

The new model generates 360p video from text or images and creates corresponding audio tracks simultaneously, a notable step for integrated audiovisual synthesis.

Jan 28, 2026

Image → Video Text → Video

robbyant/Image → Video

Lingbot-World Animates Images with Camera Control

The new open-source world model from Robbyant generates short video clips from a single image, giving users control over the virtual camera path.

Jan 26, 2026

Image → Video

Lightricks/Image → Video Major release

Lightricks Releases LTX-2 Multimodal Video Generator

The new diffusion model from the creative app company can generate short video clips from text, images, audio, and even other videos.

Jan 3, 2026

Image → Video Text → Video

Huaichang/Image → Video

PersonaLive Model Animates Portraits in Real Time

The new open-source model, published under the Huaichang account, uses a diffusion-based architecture to generate expressive video from a single still image.

Dec 13, 2025

Image → Video

Tencent/Image → Video

Tencent's HY-WorldPlay Creates 3D Scenes from One Image

The new model from Tencent's Hunyuan team generates dynamic video and reconstructs 3D environments using a single static picture.

Dec 12, 2025

Image → Video Text → 3D

Quark Vision/Image → Video

Quark Vision's Live-Avatar Animates Photos With Audio

The new 14-billion-parameter model uses audio input to generate realistic talking head videos from a single still image.

Dec 4, 2025

Image → Video

Tencent/Text → Video Major release

Tencent Releases HunyuanVideo 1.5 Generation Model

The new diffusion model generates short video clips from text and image prompts, adding another major player to the open video space.

Nov 18, 2025

Image → Video Text → Video

Meituan/Text → Video

Meituan Releases Open-Source LongCat-Video Model

The Chinese tech giant has released a new MIT-licensed model that generates video from text or images, or by continuing existing clips.

Oct 24, 2025

Image → Video Text → Video

EPFL VITA/Image → Video

EPFL Releases SVI for Streaming Image-to-Video

The new open-source model from Swiss researchers uses a novel chunking method to generate indefinitely long videos from a single still image.

Oct 8, 2025

Image → Video

chetwinlow1/Image → Video

Ovi Syncs Audio and Video in New Open-Source Model

Built on the Wan2.2 architecture, this new 5-billion-parameter model generates short video clips from a single image and simultaneously creates synchronized audio.

Sep 30, 2025

Image → Video

ByteDance/Image → Video

ByteDance Releases Lynx for Identity-Preserving Video

The new model from the TikTok parent company generates short video clips that maintain a person's likeness from a single reference image.

Sep 26, 2025

Image → Video

ByteDance/Image → Video

ByteDance Releases HuMo for Human Video Generation

The new open-source model specializes in creating realistic videos of people, separating appearance from motion for greater control.

Sep 10, 2025

Image → Video

Tencent/Image → Video

Tencent's Voyager Model Turns Images into 3D Worlds

The new model from Tencent's Hunyuan team generates temporally and spatially consistent video sequences from a single image, enabling virtual exploration of static scenes.

Aug 27, 2025

Image → Video Text → 3D

Qwen · Alibaba/Image → Video

Alibaba Releases 14B Model for Audio-Driven Video

The new Wan2.2-S2V model takes a still image and a speech track to generate a realistic talking-head animation, available under a permissive license.

Aug 25, 2025

Image → Video

Tencent/Image → Video

Tencent Releases Controllable Game Video Model

The new Hunyuan-GameCraft 1.0 is an open image-to-video model that generates interactive game-like scenes with precise camera control.

Aug 13, 2025

Image → Video

FrancisRing/Image → Video

StableAvatar Brings Open Source Talking Heads to Life

A new diffusion-based model from developer FrancisRing animates still images into talking avatars using only an audio track.

Aug 12, 2025

Image → Video

Skywork/Image → Video

Skywork Releases Open 'World Model' for Playable Video

The new 1.3-billion-parameter model functions as an interactive 'world model,' generating controllable video scenes from a single static image.

Aug 8, 2025

Image → Video

Qwen · Alibaba/Image → Video Major release

Alibaba Releases Wan2.2, a 14B MoE Video Model

The new open-source diffusion model from the team behind Qwen uses a Mixture-of-Experts architecture to animate still images.

Jul 28, 2025

Image → Video

Qwen · Alibaba/Text → Video

Qwen Releases Wan2.2, a 5B Open-Source Video Model

The new Apache 2.0-licensed model from Alibaba's team generates video from either text prompts or still images, offering a unified approach in a compact package.

Jul 28, 2025

Image → Video Text → Video

Qwen · Alibaba/Image → Video

Qwen Releases Wan2.2, a 14B Image-to-Video Model

The new 14-billion-parameter model from Alibaba's AI team uses a Mixture-of-Experts design and is available under the permissive Apache 2.0 license.

Jul 24, 2025

Image → Video

Qwen · Alibaba/Text → Video Major release

Qwen Releases Wan2.2, a 5B Open Video AI Model

The new Apache 2.0-licensed model from Alibaba's team can generate video from both text and image prompts, adding a new tool to the open-source creative ecosystem.

Jul 18, 2025

Image → Video Text → Video

RaphaelLiu/Image → Video

Pusa V1: A New Open Model for Image-to-Video Animation

Based on the Wan2.1 architecture, this new 14B-parameter model offers fine-grained control over video generation from still images and text.

Jul 14, 2025

Image → Video Text → Video
