FLUX.2 [klein] 4B is a compact, high-performance image generation model developed by Black Forest Labs. Built on a Rectified Flow Transformer architecture, it is designed for sub-second inference and interactive creative workflows. Despite its 4-billion-parameter scale, the model maintains high visual quality, photorealism, and strong prompt adherence, particularly when rendering complex spatial relationships and readable typography.

The model uses a unified architecture that combines text-to-image generation and image editing in a single system. It accepts standard text prompts as well as single-reference and multi-reference image inputs, allowing complex edits, outpainting, and style transfer without switching models. It can follow precise instructions, including hex color codes for exact color matching in generated outputs.
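As an illustration of the hex color instructions mentioned above, the following is a minimal sketch of how a prompt with exact color codes might be assembled before it is sent to the model. The helper name `build_color_prompt` and the "element in #RRGGBB" phrasing are assumptions made for this example; they are not part of any official FLUX.2 tooling, and the model may respond equally well to other phrasings.

```python
import re

# Six-digit hex color such as #E94560 (hypothetical validation rule for this sketch).
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def build_color_prompt(description: str, colors: dict[str, str]) -> str:
    """Append 'element in #RRGGBB' clauses to a base description.

    `colors` maps a scene element (e.g. "background") to a hex color code.
    Raises ValueError if a code is not a six-digit hex color.
    """
    clauses = []
    for element, code in colors.items():
        if not HEX_COLOR.fullmatch(code):
            raise ValueError(f"not a 6-digit hex color: {code!r}")
        # Normalize to upper case so the same color is always written the same way.
        clauses.append(f"{element} in {code.upper()}")
    if not clauses:
        return description
    return f"{description}, with " + ", ".join(clauses)


prompt = build_color_prompt(
    "A minimalist poster for a jazz festival",
    {"background": "#1a1a2e", "headline text": "#E94560"},
)
print(prompt)
```

Validating the codes up front catches typos such as a missing digit before any generation time is spent on a prompt the model cannot match exactly.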

Architecture and Variants

The 4B variant is released under an Apache 2.0 license, making it accessible for commercial use and community fine-tuning. It is optimized for efficiency, requiring approximately 13 GB of VRAM in its base form, which allows it to run on consumer-grade hardware. The model is typically available in two versions: a Base (undistilled) model suited to fine-tuning and research, and a Distilled (4-step) version designed for rapid, real-time generation.
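To put the VRAM figure in context, the sketch below estimates the memory taken by the model weights alone at a few numeric precisions and checks a GPU against the approximately 13 GB requirement. The bytes-per-parameter table is standard arithmetic; the function names and the 1 GB headroom are assumptions for this example. The difference between the weights-only estimate and the quoted 13 GB presumably covers other components and runtime activations, which this sketch does not model.

```python
# Storage cost per parameter for common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_in_vram(required_gb: float, available_gb: float, headroom_gb: float = 1.0) -> bool:
    """True if the GPU has the required memory plus some spare headroom."""
    return available_gb >= required_gb + headroom_gb


NUM_PARAMS = 4e9          # 4-billion-parameter model
QUOTED_VRAM_GB = 13.0     # approximate figure quoted for the base model

for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB for weights")

for gpu_gb in (12, 16, 24):
    verdict = "fits" if fits_in_vram(QUOTED_VRAM_GB, gpu_gb) else "does not fit"
    print(f"{gpu_gb} GB GPU: {verdict}")
```

Under these assumptions, a 16 GB or 24 GB consumer card clears the quoted requirement, while a 12 GB card does not without further memory-saving measures.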

For optimal results, the model responds best to detailed, descriptive prompts. When performing edits, providing clear natural-language instructions alongside reference images enables the model to maintain character consistency and spatial logic across iterations.

Rankings & Comparison