Benchmarks
How open-weight models stack up across reasoning, code, vision, and image generation. Data is cached from public sources and refreshed every 12 hours.
Artificial Analysis Intelligence Index — a blended measure of reasoning, knowledge, and coding across open-weight and proprietary language models.
Human-preference Elo for text-to-image models from the Artificial Analysis Arena — open-weight and proprietary, ranked head-to-head.
| # | Model | Organization | Elo |
|---|---|---|---|
| 1 | HiDream-O1-Image-Dev-2604 | HiDream | 1,189 |
| 2 | ERNIE Image | Baidu | 1,168 |
| 3 | ERNIE Image Turbo | Baidu | 1,160 |
| 4 | FLUX.2 [dev] | Black Forest Labs | 1,159 |
| 5 | FLUX.2 [dev] Turbo | Fal | 1,156 |
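To read the Elo gaps in these tables, it can help to convert a rating difference into an expected head-to-head preference rate. The sketch below assumes the conventional logistic Elo model with a 400-point scale; the source does not say which variant the Artificial Analysis Arena uses, and `expected_win_rate` is a hypothetical helper name.

```python
def expected_win_rate(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes won by model A over model B.

    Assumes the conventional logistic Elo model (400-point scale).
    This is an illustrative assumption, not the Arena's documented method.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / scale))


# Ratings taken from the text-to-image table: rank 1 versus rank 5.
first_place = 1189
fifth_place = 1156

rate = expected_win_rate(first_place, fifth_place)
print(f"Expected preference rate for rank 1 over rank 5: {rate:.3f}")
```

Under that assumption, the 33-point gap between first and fifth place corresponds to a preference rate of roughly 55%, so neighbouring entries are close to a coin flip in any single comparison.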
Human-preference Elo for instruction-based image editing models, ranked head-to-head in the Artificial Analysis Arena.
| # | Model | Organization | Elo |
|---|---|---|---|
| 1 | HunyuanImage 3.0 Instruct | Tencent | 1,230 |
| 2 | HiDream-O1-Image | HiDream | 1,210 |
| 3 | FLUX.2 [klein] 9B | Black Forest Labs | 1,168 |
| 4 | FLUX.2 [dev] Turbo | Fal | 1,157 |
| 5 | HiDream-O1-Image-Dev | HiDream | 1,150 |
Human-preference Elo for text-to-video models from the Artificial Analysis Arena — the fastest-moving open-vs-proprietary race in generative AI.
| # | Model | Organization | Elo |
|---|---|---|---|
| 1 | LTX-2.3 Fast | Lightricks | 1,130 |
| 2 | LTX-2 Fast | Lightricks | 1,128 |
| 3 | Wan 2.2 A14B | Qwen · Alibaba | 1,116 |
| 4 | HunyuanVideo-1.5 | Tencent | 1,027 |
| 5 | Wan 2.1 14B | Qwen · Alibaba | 1,021 |
Human-preference Elo for image-to-video models, ranked head-to-head in the Artificial Analysis Arena.
| # | Model | Organization | Elo |
|---|---|---|---|
| 1 | LTX-2 Fast | Lightricks | 1,179 |
| 2 | LTX-2.3 Fast | Lightricks | 1,161 |
| 3 | HunyuanVideo-1.5 | Tencent | 1,122 |
| 4 | Wan 2.2 A14B | Qwen · Alibaba | 1,107 |
| 5 | LTX Video v0.9.7 13B | Lightricks | 1,040 |
Human-preference Elo for text-to-speech models from the Artificial Analysis Arena — open-weight and proprietary voices, ranked head-to-head.
| # | Model | Organization | Elo |
|---|---|---|---|
| 1 | Kokoro 82M v1.0 | Kokoro | 1,067 |
| 2 | Maya1 | Maya Research | 1,057 |
| 3 | Fish Speech 1.5 | Fish Audio | 1,013 |
| 4 | Chatterbox | Resemble AI | 1,006 |
| 5 | Zonos-v0.1 | Zyphra | 1,000 |
Aggregate of contamination-resistant benchmarks (IFEval, BBH, MATH, GPQA, MuSR, and MMLU-Pro) for open-weight language models, taken from the final snapshot of Hugging Face's Open LLM Leaderboard.
| # | Model | Avg. |
|---|---|---|
| 1 | MaziyarPanahi/calme-3.2-instruct-78b | 52.1 |
| 2 | MaziyarPanahi/calme-3.1-instruct-78b | 51.3 |
| 3 | dfurman/CalmeRys-78B-Orpo-v0.1 | 51.2 |
| 4 | MaziyarPanahi/calme-2.4-rys-78b | 50.8 |
| 5 | huihui-ai/Qwen2.5-72B-Instruct-abliterated | 48.1 |
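The Avg. column is an aggregate over the six benchmarks named above. A minimal sketch follows, assuming an unweighted mean; the per-benchmark scores are made-up placeholders rather than real results, `aggregate_score` is a hypothetical name, and the leaderboard's own score normalization is not reproduced here.

```python
from statistics import mean
from typing import Mapping

# Benchmarks named in the leaderboard description.
BENCHMARKS = ("IFEval", "BBH", "MATH", "GPQA", "MuSR", "MMLU-Pro")


def aggregate_score(scores: Mapping[str, float]) -> float:
    """Unweighted mean over the six benchmarks, rounded to one decimal.

    Assumption: every benchmark counts equally. Raises ValueError if a
    benchmark is missing, so a partial run cannot produce a flattering mean.
    """
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    return round(mean(scores[name] for name in BENCHMARKS), 1)


# Placeholder scores for a hypothetical model (NOT real leaderboard data).
example_scores = {
    "IFEval": 80.0,
    "BBH": 60.0,
    "MATH": 40.0,
    "GPQA": 20.0,
    "MuSR": 30.0,
    "MMLU-Pro": 70.0,
}

print(aggregate_score(example_scores))
```

An unweighted mean lets one strong benchmark offset a weak one, which is worth keeping in mind when the top entries are separated by about a point.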
The most-downloaded open-weight models on Hugging Face over the last 30 days — the community's working set, straight from the Hub and refreshed every 12 hours.
| # | Model | Downloads |
|---|---|---|
| 1 | sentence-transformers/all-MiniLM-L6-v2 | 185.4M |
| 2 | cross-encoder/ms-marco-MiniLM-L6-v2 | 58.3M |
| 3 | google-bert/bert-base-uncased | 45.3M |
| 4 | BAAI/bge-small-en-v1.5 | 44.3M |
| 5 | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 37.2M |
The most-liked open-weight models on Hugging Face — a durable signal of community esteem spanning every modality, refreshed every 12 hours.
| # | Model | Organization | Likes |
|---|---|---|---|
| 1 | deepseek-ai/DeepSeek-R1 | DeepSeek | 13.4K |
| 2 | black-forest-labs/FLUX.1-dev | Black Forest Labs | 13.2K |
| 3 | stabilityai/stable-diffusion-xl-base-1.0 | Stability AI | 7.8K |
| 4 | CompVis/stable-diffusion-v1-4 | | 7K |
| 5 | meta-llama/Meta-Llama-3-8B | Meta AI | 6.6K |
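Several lists on this page are described as cached and refreshed every 12 hours. One way such a freshness check could look is sketched below; `is_stale` and `REFRESH_INTERVAL` are hypothetical names, the timestamp is an arbitrary example, and nothing here reflects how the page is actually implemented.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Refresh interval stated on the page.
REFRESH_INTERVAL = timedelta(hours=12)


def is_stale(fetched_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once a cached snapshot is at least 12 hours old.

    Both datetimes are expected to be timezone-aware so the subtraction
    is unambiguous.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - fetched_at >= REFRESH_INTERVAL


# Arbitrary example timestamp for illustration.
fetched = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
print(is_stale(fetched, now=fetched + timedelta(hours=11)))  # False
print(is_stale(fetched, now=fetched + timedelta(hours=12)))  # True
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use it would be left at its default.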