Qwen Releases Open-Source Instruction-Based Image Editor
The new model from Alibaba's Qwen team allows users to modify images using natural language prompts instead of complex tools or masks.
Alibaba's Qwen team has expanded its open-source portfolio with the release of Qwen-Image-Edit, a model designed for instruction-based image editing. Made available under a permissive Apache 2.0 license, the model allows users to alter specific parts of an image by providing simple text commands. The model card and weights can be found on its Hugging Face repository.
Unlike text-to-image generation models that create visuals from scratch, Qwen-Image-Edit takes an existing image and a set of instructions as input. This approach enables more controlled and targeted modifications, such as changing an object's color, adding an element to a scene, or altering a background, without requiring manual selection tools or complex software.
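For readers who want to see what "an existing image plus an instruction" might look like in practice, here is a minimal sketch using the Hugging Face diffusers library. It is not taken from the release itself. The pipeline class `QwenImageEditPipeline`, the repository id `Qwen/Qwen-Image-Edit`, the parameter values, and the helper names `build_edit_inputs` and `edit_image` are all assumptions made for illustration. Newer checkpoints such as the one listed under Sources may require a different pipeline class, so check the model card before relying on any of this.

```python
from typing import Any, Dict


def build_edit_inputs(image: Any, instruction: str, steps: int = 50) -> Dict[str, Any]:
    """Bundle an existing image and a text instruction into pipeline keyword arguments.

    Hypothetical helper: the keyword names follow the assumed diffusers pipeline call.
    """
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    return {
        "image": image,                # the picture to modify, not a blank canvas
        "prompt": instruction,         # the natural-language edit, e.g. "make the car red"
        "negative_prompt": " ",
        "true_cfg_scale": 4.0,         # assumed guidance value, tune as needed
        "num_inference_steps": steps,
    }


def edit_image(input_path: str, instruction: str, output_path: str) -> None:
    """Load the model, apply one instruction to one image, and save the result."""
    # Heavy imports are deferred so the helper above works without a GPU stack.
    import torch
    from PIL import Image
    from diffusers import QwenImageEditPipeline  # assumed: available in a recent diffusers release

    pipeline = QwenImageEditPipeline.from_pretrained(
        "Qwen/Qwen-Image-Edit",  # assumed repository id
        torch_dtype=torch.bfloat16,
    )
    pipeline.to("cuda")

    image = Image.open(input_path).convert("RGB")
    with torch.inference_mode():
        result = pipeline(**build_edit_inputs(image, instruction))
    result.images[0].save(output_path)
```

A call such as `edit_image("input.png", "Change the background to a beach at sunset", "output.png")` would then produce an edited copy of the file, with no mask or manual selection step involved.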
The release gives developers an open-source foundation for building more intuitive creative tools. Because the model translates natural language directly into pixel-level changes, it lowers the barrier to advanced photo editing. The capability could be integrated into consumer applications, professional design software, or specialized commercial platforms.
Qwen-Image-Edit joins a growing family of multimodal models from the Qwen team, underscoring their continued investment in the open-source AI ecosystem. By providing powerful and accessible tools for both text and vision, they are helping to democratize capabilities that were once exclusive to proprietary systems.
Sources
- Qwen/Qwen-Image-Edit-2509 (Hugging Face)