
MiniMax Debuts M2.1, an MoE Model Optimized with FP8

The new Mixture of Experts model from the Chinese AI firm uses 8-bit floating-point precision for a smaller memory footprint and faster inference.

Dec 20, 2025


AI research company MiniMax has released MiniMax-M2.1, a new large language model built on a Mixture of Experts (MoE) architecture. Where many contemporary models ship with 16-bit weights, M2.1 ships with FP8 weights, a choice aimed at cutting memory use and inference cost.

Precision and Performance

The adoption of 8-bit floating-point (FP8) precision is a key trend in making large models more practical to deploy. An FP8 weight occupies one byte instead of the two bytes used by 16-bit formats such as FP16 or BF16, so weight storage is roughly halved, and inference can run faster on hardware with native FP8 support. This lets MoE models, which are often very large, run on less demanding infrastructure, at the cost of reduced numerical precision.
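The arithmetic behind that trade-off can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how M2.1 is quantized: the 100-billion parameter count is a hypothetical round number rather than M2.1's size, and `round_to_mantissa_bits` only mimics the 3 stored mantissa bits of the common FP8 E4M3 layout, ignoring its exponent range, subnormals, and saturation.

```python
import math

def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone (no activations, no KV cache)."""
    return num_params * bits_per_weight // 8

def round_to_mantissa_bits(x: float, bits: int = 3) -> float:
    """Round x to the nearest value with `bits` stored mantissa bits.

    Simplified model of a low-precision float: exponent range, subnormals,
    and saturation are ignored.
    """
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)   # x == mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    scale = 2 ** (bits + 1)              # stored bits plus the implicit leading bit
    return math.ldexp(round(mantissa * scale) / scale, exponent)

# Hypothetical model size, chosen only to make the arithmetic easy to follow.
HYPOTHETICAL_PARAMS = 100_000_000_000

fp16_gib = weight_bytes(HYPOTHETICAL_PARAMS, 16) / 2**30
fp8_gib = weight_bytes(HYPOTHETICAL_PARAMS, 8) / 2**30
print(f"16-bit weights: {fp16_gib:.1f} GiB, 8-bit weights: {fp8_gib:.1f} GiB")

# Precision cost: 0.1 cannot be represented with 3 mantissa bits.
approx = round_to_mantissa_bits(0.1)
print(f"0.1 stored as {approx}, relative error {abs(approx - 0.1) / 0.1:.2%}")
```

Real FP8 deployments usually pair the 8-bit values with per-tensor or per-block scaling factors to keep weights inside the representable range; that step is omitted here.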

As a Mixture of Experts model, M2.1 is built around sparse computation. MoE architectures typically route each input token to a small number of specialized "expert" subnetworks, so only a fraction of the model's total parameters is active for any given token. This sparse activation, combined with the FP8 data type, positions M2.1 as a model focused on efficient scaling.
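A minimal sketch of that routing idea follows, using generic top-k gating. Everything in it is assumed for illustration: the four toy experts, the gate logits, and `top_k=2` are hypothetical, and nothing here is taken from M2.1's actual router.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(gate_logits, top_k):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(token, gate_logits, experts, top_k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    routes = route_token(gate_logits, top_k)
    output = sum(weight * experts[i](token) for i, weight in routes)
    return output, routes

# Toy stand-ins for expert subnetworks (hypothetical).
experts = [
    lambda x: x + 1.0,
    lambda x: 2.0 * x,
    lambda x: x * x,
    lambda x: -x,
]

# In a real model the gate logits come from a learned router layer.
output, routes = moe_forward(3.0, [0.1, 2.0, 1.5, -1.0], experts, top_k=2)
active = [index for index, _ in routes]
print(f"active experts: {active} of {len(experts)}, output: {output:.3f}")
```

Only two of the four experts run for this token, which is the sense in which an MoE model's per-token compute is smaller than its total parameter count suggests.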

The model is available for download and experimentation on the Hugging Face Hub, but it comes with a custom license. The terms restrict usage to non-commercial and research purposes only, prohibiting commercial deployment and the redistribution of the model or any derivatives.

Sources

MiniMaxAI/MiniMax-M2.1
Hugging Face

