
Tencent's HY-WorldPlay Creates 3D Scenes from One Image

The new model from Tencent's Hunyuan team generates dynamic video and reconstructs 3D environments from a single static image.

Dec 12, 2025


Tencent has introduced HY-WorldPlay, an AI model designed to generate interactive video and 3D assets from a single image. Part of the company's broader Hunyuan AI initiative, the release moves beyond plain video generation toward dynamic, explorable digital scenes.

Unlike many image-to-video models that create short, fixed animations, HY-WorldPlay functions as an interactive world model. It interprets a static image to build a coherent 3D environment, which can then be used to generate long, consistent videos. Users can also guide the camera and action with text prompts, effectively "playing" within the newly created world.

From Image to Interactive World

The model's core capabilities focus on two distinct but related tasks:

Scene Generation: Creating extended video sequences that maintain temporal and spatial consistency, based on an initial image.
3D Reconstruction: Inferring the geometry and texture of the scene to build a 3D model from the 2D input.

This technology points toward more sophisticated generative tools for creating content for games, simulations, and virtual reality. By building a foundational understanding of a scene's 3D structure, models like HY-WorldPlay can produce more dynamic and controllable outputs than those that simply manipulate pixels.

The model is now available for download on Hugging Face, though its use is restricted. A custom license agreement limits it to non-commercial research purposes, a common approach for foundational model releases.
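For readers who want to fetch the files, the download could look roughly like the sketch below. It assumes the `huggingface_hub` Python package is installed and uses the repository ID listed under Sources. The function name `download_weights` and the target folder `hy-worldplay` are illustrative choices, not part of Tencent's release, and the license terms described above still apply to anything downloaded.

```python
from pathlib import Path

# Repository ID as listed in the Sources entry of this article.
REPO_ID = "tencent/HY-WorldPlay"


def download_weights(local_dir: str = "hy-worldplay") -> Path:
    """Fetch a snapshot of the model repository into local_dir (hypothetical helper)."""
    # Imported lazily so this module still loads where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the local path of the downloaded snapshot.
    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)
```

Calling `download_weights()` would start the transfer. Nothing is downloaded when the function is merely defined.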

Sources

tencent/HY-WorldPlay
Hugging Face


