Meituan Releases Open, Bilingual Image Editing Model
The new LongCat-Image-Edit model follows natural language instructions to perform complex photo manipulations in both English and Chinese.

Chinese technology company Meituan has released LongCat-Image-Edit, an open-source model designed for instruction-based image editing. Released under a permissive Apache 2.0 license, the model provides a new tool for developers and creators working on generative AI applications.
Unlike traditional text-to-image models that generate images from scratch, LongCat-Image-Edit modifies existing images based on specific user commands. A key feature is its bilingual capability, allowing it to understand and execute instructions in both English and Chinese, making it accessible to a wider global audience.
Precise, Instruction-Based Control
The model carries out a range of common editing tasks from natural language instructions alone. This approach allows for more intuitive control than writing complex prompts or using masking tools. According to the project's documentation, LongCat-Image-Edit can handle:
- Local Editing: Changing the attributes of a specific object, like "change the color of the car to red."
- Style Modification: Altering the overall aesthetic, such as applying a "watercolor style."
- Global Replacement: Swapping out major elements, like changing the background from a city to a forest.
The model weights and usage instructions are now available for download from the Hugging Face Hub. This release adds another powerful, open tool to the growing ecosystem for AI-powered creative work, particularly for tasks requiring precise, user-guided manipulation rather than pure generation.
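For readers who want to fetch the weights programmatically, the following is a minimal sketch rather than the project's official instructions. It assumes the `huggingface_hub` Python package is installed and that the repository ID matches the name shown on the model's Hugging Face page (meituan-longcat/LongCat-Image-Edit). The helper name `download_weights` and the default target directory are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Repository ID as listed on the model's Hugging Face page (assumed to be current).
REPO_ID = "meituan-longcat/LongCat-Image-Edit"


def download_weights(target_dir: str = "longcat-image-edit") -> Path:
    """Download every file in the model repository into target_dir.

    Hypothetical helper for illustration; the project's own usage
    instructions may recommend a different workflow.
    """
    # Imported lazily so this module can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the full repository and returns the local folder path.
    local_path = snapshot_download(repo_id=REPO_ID, local_dir=target_dir)
    return Path(local_path)
```

Calling `download_weights()` requires network access and enough disk space for the full repository. Loading the model and running an edit is not shown here, since those steps depend on the usage instructions published with the weights.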
Sources
- meituan-longcat/LongCat-Image-Edit (Hugging Face)