Tencent's HY-WorldPlay Creates 3D Scenes from One Image
The new model from Tencent's Hunyuan team generates dynamic video and reconstructs 3D environments from a single static image.
Tencent has introduced HY-WorldPlay, an AI model that generates interactive video and 3D assets from a single image. Part of the company's broader Hunyuan AI initiative, the release moves beyond fixed video generation toward dynamic, explorable digital scenes.
Unlike many image-to-video models that create short, fixed animations, HY-WorldPlay functions as an interactive world model. It interprets a static image to build a coherent 3D environment, which can then be used to generate long, consistent videos. Users can also guide the camera and action with text prompts, effectively "playing" within the newly created world.
From Image to Interactive World
The model's core capabilities focus on two distinct but related tasks:
- Scene Generation: Creating extended video sequences that maintain temporal and spatial consistency, based on an initial image.
- 3D Reconstruction: Inferring the geometry and texture of the scene to build a 3D model from the 2D input.
The approach points toward more capable generative tools for games, simulations, and virtual reality. Because they model a scene's 3D structure, systems like HY-WorldPlay can produce more dynamic and controllable outputs than models that operate only on pixels.
The model is available for download on Hugging Face, but its use is restricted: a custom license agreement limits it to non-commercial research, a common approach for foundation model releases.
Sources
- tencent/HY-WorldPlay (Hugging Face)