Zhipu AI Releases Open, Bilingual GLM-Image Model
The new text-to-image model is fluent in both Chinese and English, built on the CogView2 architecture and released under a permissive MIT license.

Zhipu AI, a prominent Beijing-based AI research company, has released GLM-Image, a new open-source model for generating images from text descriptions. Its main distinction is native bilingual support: the model accepts prompts in both Chinese and English.
This release is a notable contribution to the open-source multimodal landscape, which has historically been dominated by English-centric models. By providing a powerful bilingual tool under a permissive MIT license, Zhipu AI is lowering the barrier for developers and researchers worldwide to build applications that serve a more linguistically diverse audience.
Technical Foundations
GLM-Image is a diffusion model built on the architecture of CogView2, an earlier text-to-image model from the same research lineage. It pairs a Transformer-based text encoder, which interprets the prompt, with a diffusion model that synthesizes the final image.
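To make that pairing concrete, here is a toy sketch of the data flow only: a prompt is turned into a conditioning vector, and a noisy sample is refined step by step toward something determined by that vector. Everything in it is an assumption for illustration. The names `encode_text`, `denoise_step`, and `generate` are hypothetical, the "encoder" is just a hash, and the "denoiser" is simple interpolation; none of it reflects GLM-Image's actual code or training.

```python
import hashlib
import random


def encode_text(prompt: str, dim: int = 8) -> list[float]:
    """Toy stand-in for a text encoder: hash the prompt into a fixed-size vector.

    Works on UTF-8 bytes, so Chinese and English prompts are handled alike.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def denoise_step(x: list[float], cond: list[float], t: int, steps: int) -> list[float]:
    """Toy stand-in for a denoiser: nudge the sample toward the conditioning vector."""
    alpha = 1.0 / (steps - t)  # step size grows to 1.0 on the final step
    return [xi + alpha * (ci - xi) for xi, ci in zip(x, cond)]


def generate(prompt: str, steps: int = 10, seed: int = 0) -> list[float]:
    """Start from Gaussian noise and refine it, conditioned on the encoded prompt."""
    rng = random.Random(seed)
    cond = encode_text(prompt)
    x = [rng.gauss(0.0, 1.0) for _ in cond]  # the initial "image" is pure noise
    for t in range(steps):
        x = denoise_step(x, cond, t, steps)
    return x


sample_en = generate("a cat on a sofa")
sample_zh = generate("沙发上的一只猫")
```

In a real system the conditioning vector would come from a trained Transformer encoder and each denoising step would be a learned network; the loop structure is the only part this sketch is meant to convey.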
The model and its weights are available for download from the project's Hugging Face repository. The release continues Zhipu AI's pattern of publishing its models openly, giving the multimodal research community another base to build on.
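For readers who want to fetch the files, the following is a minimal sketch that assumes the `huggingface_hub` Python package is installed and that the repository ID is `zai-org/GLM-Image`, as listed under Sources. The helper names `repo_url` and `download_weights` and the default target directory are hypothetical; the repository's own model card is the place to check for the recommended loading code.

```python
from pathlib import Path

# Repository ID as listed in the article's Sources section.
REPO_ID = "zai-org/GLM-Image"


def repo_url(repo_id: str = REPO_ID) -> str:
    """Build the web URL of a Hugging Face model repository."""
    return f"https://huggingface.co/{repo_id}"


def download_weights(target_dir: str = "glm-image") -> Path:
    """Download every file in the repository into target_dir (assumed name)."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=REPO_ID, local_dir=target_dir)
    return Path(local_path)


# Usage (requires network access and enough disk space for the weights):
# weights_dir = download_weights()
# print(weights_dir)
```

`snapshot_download` pulls the whole repository snapshot; the call is left commented out because it needs network access and may transfer a large amount of data.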
Sources
- zai-org/GLM-Image (Hugging Face)