Black Forest Labs/Text → Image

Black Forest Labs Releases 9B FLUX.2 Image Model

The new text-to-image model emphasizes speed and efficiency with a novel architecture and FP8 quantization.

Jan 14, 2026


German research company Black Forest Labs has released FLUX.2-klein-base, a new 9-billion-parameter model for text-to-image generation and editing. The release marks the debut of the FLUX.2 family, which introduces a new architectural approach designed for high-speed inference.

Unlike earlier latent diffusion models such as Stable Diffusion 1.x and SDXL, which rely on a U-Net, FLUX.2 uses a multi-stage process built on transformers. This design, combined with native FP8 precision, aims to deliver faster image generation on consumer-grade hardware. The "klein" (German for "small") designation suggests that this 9B model is an efficient entry point into a new class of powerful image generators.
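To make the FP8 point concrete, the following back-of-the-envelope sketch estimates the memory needed just to hold 9 billion weights at different precisions. It assumes 1 byte per parameter for FP8, 2 for BF16, and 4 for FP32, and it ignores activations, text encoders, and any other components, so real usage will be higher. The function name is illustrative and not part of any library.

```python
# Rough weight-memory estimate; ignores activations and auxiliary components.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}

def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

NUM_PARAMS = 9_000_000_000  # the 9B parameter count quoted in the article

for dtype in ("fp32", "bf16", "fp8"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights take about 8.4 GiB at FP8 versus about 16.8 GiB at BF16, which is the difference between fitting and not fitting on many consumer GPUs.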

How it Works

The model's architecture is composed of two main transformer components:

A large, text-guided transformer that processes prompts and generates a base 128x128 image.
A smaller, specialized upscaler transformer that refines the initial output into a final 1024x1024 image.

This two-part system is designed to produce detailed images while keeping inference fast: the text-guided stage works at low resolution, and the upscaler adds detail afterwards. The model weights and usage examples are available in the official Hugging Face repository.
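The data flow described above can be sketched with two placeholder functions. This is only an illustration of the shapes involved, not the model's actual method: `generate_base` and `upscale` are hypothetical names, the first returns a deterministic pattern instead of a generated image, and the second uses plain nearest-neighbour enlargement as a stand-in for the upscaler transformer.

```python
BASE_SIZE = 128     # base resolution stated in the article
FINAL_SIZE = 1024   # final resolution stated in the article

def generate_base(prompt: str, size: int = BASE_SIZE) -> list[list[int]]:
    """Hypothetical stand-in for the text-guided stage: a size x size grid."""
    # A real model would condition on the prompt; here we only derive a
    # deterministic pattern from it so the example is reproducible.
    seed = sum(prompt.encode("utf-8"))
    return [[(seed + x * y) % 256 for x in range(size)] for y in range(size)]

def upscale(image: list[list[int]], target: int = FINAL_SIZE) -> list[list[int]]:
    """Hypothetical stand-in for the upscaler stage: nearest-neighbour enlargement."""
    factor = target // len(image)
    return [[image[y // factor][x // factor] for x in range(target)]
            for y in range(target)]

base = generate_base("a lighthouse at dusk")
final = upscale(base)
scale = FINAL_SIZE // BASE_SIZE

print(len(base), len(base[0]))    # 128 128
print(len(final), len(final[0]))  # 1024 1024
print(scale, scale**2)            # 8 64
```

The last line shows why the split matters: going from 128x128 to 1024x1024 is an 8x increase per side, so the first stage handles 64 times fewer pixels than the final image contains.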

While the model's weights are publicly accessible, they are governed by a custom license that prohibits commercial use and places other restrictions on redistribution and training. Developers and researchers should review the terms carefully before integrating the model into their work.

Sources

black-forest-labs/FLUX.2-klein-base-9b-fp8 (Hugging Face)


