
DeepSeek Releases 671B MoE Model Under MIT License

The new DeepSeek-V3.1-Base is a massive 671-billion-parameter Mixture-of-Experts model designed for efficient, large-scale research and development.

Aug 19, 2025

Major release · MIT

AI research firm DeepSeek has released DeepSeek-V3.1-Base, a new foundation model with 671 billion total parameters. That makes it one of the largest base models available to the public under a permissive license.

The model uses a Mixture-of-Experts (MoE) architecture, a design that allows very large parameter counts while keeping compute costs in check. Rather than activating all 671 billion parameters for every token, an MoE model uses a router to send each token to a small subset of specialized "expert" subnetworks, so the compute per token is far lower than in a dense model of the same size. All experts must still be stored, so the memory footprint follows the total parameter count. The official model card also notes the use of FP8 weights, a lower-precision 8-bit format that reduces memory requirements and can improve inference throughput on hardware that supports it.
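To make the routing idea and the FP8 memory arithmetic concrete, here is a minimal sketch in plain Python. It shows generic softmax top-k routing over toy experts, not DeepSeek's actual router, and every name in it (`route_top_k`, `moe_forward`, `weight_memory_gb`, the four toy experts) is hypothetical. The memory estimate counts weights only and assumes a 16-bit format as the comparison point.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and mix their outputs by router weight."""
    selected = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in selected)

def weight_memory_gb(n_params, bytes_per_param):
    """Approximate weight storage in gigabytes (weights only, no activations)."""
    return n_params * bytes_per_param / 1e9

# Toy setup (hypothetical): four "experts" that just scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_scores = [0.1, 2.0, -1.0, 1.5]  # in a real model these come from a learned router

selected = route_top_k(router_scores, k=2)          # only 2 of the 4 experts run
output = moe_forward(3.0, experts, router_scores, k=2)

fp8_gb = weight_memory_gb(671e9, 1)    # 1 byte per parameter  -> 671.0 GB
bf16_gb = weight_memory_gb(671e9, 2)   # 2 bytes per parameter -> 1342.0 GB
```

In this toy run only experts 1 and 3 are evaluated, which is the source of the per-token compute savings. The memory figures are rough: FP8 checkpoints typically also store scaling factors, and serving a model needs additional memory beyond the weights.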

Why it matters

Releasing a model of this scale under the MIT license is a significant contribution to the open-source community. The license permits commercial use, modification, and redistribution, so researchers and developers get a foundation for building specialized applications without the restrictive terms often attached to state-of-the-art models. Organizations can use it as a starting point for fine-tuning on proprietary data for complex reasoning and generation tasks.

As a "base" model, DeepSeek-V3.1-Base is not intended for direct use as a chatbot; it is meant for further training and adaptation. Developers can download the weights and accompanying files from the model's Hugging Face repository. The release continues a trend of top-tier AI capabilities becoming more widely accessible.
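For readers who want to inspect the repository before committing to a full download, the following is a minimal sketch using the `huggingface_hub` package. The repository name is taken from the Sources entry; the presence of a `config.json` file in that repository is an assumption, and `fetch_config` is a hypothetical helper name. It needs network access and `pip install huggingface_hub`.

```python
import json

# Repository name as listed under Sources.
REPO_ID = "deepseek-ai/DeepSeek-V3.1-Base"

def fetch_config(repo_id=REPO_ID, filename="config.json"):
    """Download one small file from the repo and parse it as JSON.

    Assumes the repo contains `filename`; config.json is conventional
    for Hugging Face model repos but is not confirmed by this article.
    """
    # Imported lazily so the module loads even without the package installed.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Example (requires network access):
# config = fetch_config()
# print(sorted(config.keys()))
```

Fetching a single metadata file first is a cheap way to check the architecture settings; the full set of weights is hundreds of gigabytes and calls for a deliberate storage plan.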

Sources

deepseek-ai/DeepSeek-V3.1-Base (Hugging Face)

