Nanbeige Releases 3B Chinese-Enhanced Language Model
The new Llama-based model was trained from scratch on 3.5 trillion tokens of Chinese and English data to enhance its bilingual capabilities.

Chinese AI firm Nanbeige Technology has released Nanbeige4.1-3B, a new 3-billion-parameter language model designed for strong performance in both Chinese and English. The model is based on the popular Llama architecture but was trained from scratch on a custom, high-quality dataset of 3.5 trillion tokens.
This release adds a new contender to the growing field of specialized, efficient open models. By focusing on a specific language pair and a compact size, Nanbeige4.1-3B offers developers a capable option that can run on modest hardware. Benchmarks published by the company show it performing competitively against other models in its class on Chinese-centric evaluations such as CMMLU and C-Eval.
Key Specifications
The model's main technical details are as follows:
- Architecture: Llama
- Parameters: 3 billion
- Context Length: 4096 tokens
- Training Data: 3.5T tokens (Chinese & English)
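To put these figures in context, here is a rough back-of-the-envelope sketch of what a 3-billion-parameter model needs in memory for its weights alone, and how many training tokens it saw per parameter. The bytes-per-parameter values are standard sizes for common numeric formats and are assumptions for illustration, not details from the release. The estimate ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
# Back-of-the-envelope estimate based on the spec list above.
PARAMS = 3_000_000_000              # 3 billion parameters
TRAIN_TOKENS = 3_500_000_000_000    # 3.5T training tokens

# Assumed storage formats (illustrative; the release does not specify these).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {name: weight_memory_gb(PARAMS, b) for name, b in BYTES_PER_PARAM.items()}
tokens_per_param = TRAIN_TOKENS / PARAMS

for name, gb in estimates.items():
    print(f"{name:>9}: {gb:.1f} GB")
print(f"tokens per parameter: {tokens_per_param:.0f}")
```

Under these assumptions the weights take about 6 GB at 16-bit precision and about 1.5 GB at 4-bit, which is why a model of this size can fit on consumer GPUs and some laptops.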
According to its Hugging Face release card, Nanbeige4.1-3B is available under a custom license that permits free commercial use, an important consideration for teams looking to build products with the model.
Sources
- Nanbeige/Nanbeige4.1-3B, Hugging Face