Tencent Releases HY-OmniWeaving for Multi-Image Video
Built on Tencent's HunyuanVideo-1.5 architecture, the new model synthesizes video by combining multiple static images and text prompts into a cohesive narrative.

Tencent has released HY-OmniWeaving, an open-source model designed to generate video from a combination of multiple images and text prompts. Developed by the Tencent Hunyuan team, the model represents a step beyond single-image animation, focusing instead on creating dynamic video sequences from several static sources.
Unlike many image-to-video models that animate a single input, HY-OmniWeaving is engineered to "weave" together different images into a coherent story. According to the release on Hugging Face, the model is based on the team's existing HunyuanVideo-1.5 model. Users can provide a series of images and use text to guide the motion and transitions between them, effectively directing a short, multi-shot scene.
From Static to Sequence
The model's core capability is interpreting the spatial and narrative relationships between distinct images, which lets it produce multi-shot sequences rather than simple motion effects applied to a single picture. Key features include:
- Multi-image input: The model can synthesize a single video from several different source images.
- Text-guided narrative: Text prompts control the action and flow of the generated video sequence.
- Temporal consistency: It aims to maintain a consistent look and feel as it transitions between elements from different source images.
This approach opens up new possibilities for generative video, moving from simple animation to programmatic storytelling. Potential applications include creating short narratives from storyboards, generating dynamic product slideshows, and animating comic panels. The model is available now on Hugging Face for developers and researchers to explore.
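To make the multi-image, text-guided input described above more concrete, here is a minimal Python sketch of how an ordered set of source images and per-shot prompts could be organized before being handed to a video model. Everything in it is an assumption for illustration: the `Shot` and `WeaveRequest` names, the two-image minimum, and the way prompts are joined are hypothetical and are not taken from the HY-OmniWeaving release, whose actual interface may differ.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Shot:
    """One source image plus the text guiding its motion (hypothetical type)."""
    image: Path
    prompt: str


@dataclass
class WeaveRequest:
    """Hypothetical container for a multi-image, text-guided request."""
    shots: list[Shot] = field(default_factory=list)

    def add(self, image: str, prompt: str) -> "WeaveRequest":
        # Order matters: shots are kept in the sequence they should appear.
        self.shots.append(Shot(Path(image), prompt))
        return self

    def validate(self) -> None:
        # Assumption: "multi-image" means at least two source images.
        if len(self.shots) < 2:
            raise ValueError("need at least two source images")
        for i, shot in enumerate(self.shots):
            if not shot.prompt.strip():
                raise ValueError(f"shot {i} has an empty prompt")

    def combined_prompt(self) -> str:
        # Assumption: per-shot prompts are joined into one narrative string.
        return " Then, ".join(s.prompt.strip() for s in self.shots)


request = (
    WeaveRequest()
    .add("panel_1.png", "A hiker reaches the ridge at dawn.")
    .add("panel_2.png", "the camera pans to the valley below.")
)
request.validate()
print(request.combined_prompt())
```

Keeping the images and their prompts paired in one ordered list mirrors the storyboard and comic-panel use cases mentioned above, where each frame carries its own description.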
Sources
- tencent/HY-OmniWeaving (Hugging Face)