Zhipu AI Releases GLM-4.7-Flash MoE Model
The new Mixture-of-Experts model from the Beijing-based AI company is optimized for speed and released under the permissive MIT license.
Chinese AI firm Zhipu AI has released GLM-4.7-Flash, a new language model designed for high-speed inference. It employs a Mixture-of-Experts (MoE) architecture, a technique that allows models to scale up their parameter counts while keeping computational costs manageable during inference.
The 'Flash' in its name signals the model's primary goal: fast inference. MoE models achieve this by activating only a fraction of their total parameters (a subset of the 'experts') for any given input token. Per-token compute therefore tracks the active parameters rather than the full parameter count, which makes these models faster and cheaper to run in real-time applications than dense models of a similar total size.
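To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is illustrative only: the architecture details of GLM-4.7-Flash are not described here, so the dimensions, the number of experts, the `top_k` value, and every function name below are assumptions chosen for the example, not the model's actual design.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    # Multiply a (rows x dim) matrix by a dim-length vector.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def moe_forward(token, router, experts, top_k):
    # The router produces one score per expert for this token.
    scores = matvec(router, token)
    # Only the top_k highest-scoring experts are selected; the rest are skipped.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Gate weights are renormalized over the selected experts only.
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(token)
    for gate, idx in zip(gates, chosen):
        expert_out = matvec(experts[idx], token)  # compute happens only here
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen


# Hypothetical toy sizes: 8 experts, 2 active per token.
rng = random.Random(0)
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
router = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_EXPERTS)
]
token = [rng.uniform(-1, 1) for _ in range(DIM)]

output, chosen = moe_forward(token, router, experts, TOP_K)
print(f"experts used: {sorted(chosen)} ({TOP_K} of {NUM_EXPERTS})")
```

In this toy setup every expert's weights still have to be held in memory, but only two of the eight expert matrices are multiplied for each token, which is where the inference savings come from.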
A Permissive License for Commercial Use
Perhaps most notably for developers and businesses, GLM-4.7-Flash is available under the MIT license. One of the most permissive open-source licenses, it allows commercial use, modification, and redistribution, and requires little more than preserving the copyright and license notice. The combination of an efficient architecture and a business-friendly license makes the model an attractive option for integration into products and services.
The model is the latest addition to Zhipu AI's GLM-4 family of models. While specific details on its total parameter count and context length have not been disclosed, developers can access the model weights and resources on its official Hugging Face repository.
Sources
- zai-org/GLM-4.7-Flash (Hugging Face)