Qwen Releases Open, Bilingual Image Editing Model
The new diffusion model from Alibaba's team allows for precise, instruction-based image modifications in both English and Chinese.
Alibaba's Qwen team has expanded its open-source toolkit with the release of Qwen-Image-Edit, a new diffusion model specialized in modifying images based on text commands. Unlike models that generate pictures from scratch, this tool focuses on refining and altering existing visuals with precise control.
The model's standout feature is its ability to understand instructions in both English and Chinese, making it accessible to a broad global community of developers and creators. It allows users to make specific, targeted changes, such as adding an object or changing a background, by describing the desired outcome in natural language.
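For readers who want to try this, the following is a minimal sketch of how an instruction-based edit might be driven from Python. It assumes the Hugging Face diffusers library exposes a QwenImageEditPipeline class and that the weights live under the Hub id "Qwen/Qwen-Image-Edit"; neither detail comes from this article, so check the model card before relying on them. The helper names build_edit_inputs and edit_image, the file paths, and the default settings are hypothetical and chosen only for illustration.

```python
# Sketch only: the pipeline class, Hub id, and argument names are assumptions
# based on the diffusers library, not details confirmed by this article.

MODEL_ID = "Qwen/Qwen-Image-Edit"  # assumed Hugging Face Hub repository id


def build_edit_inputs(instruction: str, steps: int = 50, seed: int = 0) -> dict:
    """Collect the text-side settings for one edit (hypothetical helper).

    The instruction is passed through unchanged apart from trimming
    whitespace, so it can be written in English or Chinese.
    """
    instruction = instruction.strip()
    if not instruction:
        raise ValueError("instruction must not be empty")
    return {
        "prompt": instruction,
        "negative_prompt": " ",       # assumed default
        "true_cfg_scale": 4.0,        # assumed guidance setting
        "num_inference_steps": steps,
        "seed": seed,
    }


def edit_image(src: str, dst: str, instruction: str) -> None:
    """Apply one natural-language edit to the image at src and save it to dst."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without a GPU or the model weights.
    import torch
    from diffusers import QwenImageEditPipeline  # assumed class name
    from PIL import Image

    pipe = QwenImageEditPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    inputs = build_edit_inputs(instruction)
    seed = inputs.pop("seed")  # the pipeline takes a generator, not a raw seed
    image = Image.open(src).convert("RGB")

    with torch.inference_mode():
        result = pipe(image=image, generator=torch.manual_seed(seed), **inputs)
    result.images[0].save(dst)


# Example calls (require a CUDA GPU and a model download):
# edit_image("input.png", "output_en.png", "Replace the background with a beach.")
# edit_image("input.png", "output_zh.png", "把背景换成海滩")
```

The same function handles both languages because the instruction is forwarded to the model as-is; any language handling happens inside the model itself.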
Why It Matters
Instruction-based image editing is a significant step for practical creative workflows. It offers a more direct and efficient way to use generative AI, moving beyond simple image creation to sophisticated post-production and enhancement tasks. By releasing the model under a permissive Apache 2.0 license, the Qwen team is enabling developers to freely build these advanced capabilities into their own commercial applications.
The model and its technical details are now available for download and experimentation on the Hugging Face Hub. Its release provides another powerful building block for the open-source creative AI ecosystem.