FLUX.2 [dev] is a 32-billion-parameter rectified flow transformer developed by Black Forest Labs for high-fidelity image generation and editing. It is the open-weight successor to the FLUX.1 series and is distilled for efficient inference while retaining strong prompt adherence and visual realism. The weights are released under a non-commercial license to support research and independent creative development.

Technical Architecture

The architecture pairs a Mistral-3 24B vision-language model with a latent flow matching transformer. The vision-language model supplies the contextual understanding needed to interpret complex prompts, while the flow transformer handles spatial relationships, material textures, and compositional logic. The model can generate and edit images at resolutions up to 4 megapixels across a variety of aspect ratios.
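
As a rough illustration of the 4-megapixel ceiling, the sketch below picks the largest width and height for a given aspect ratio that stay within a pixel budget. The helper name `fit_resolution`, the reading of "4 megapixels" as 4,000,000 pixels, and the rounding of each side down to a multiple of 16 are all assumptions made for this example, not documented requirements of the model.

```python
import math

# Assumption: "4 megapixels" is read as 4,000,000 pixels.
MAX_PIXELS = 4_000_000
# Assumption: sides are aligned to a multiple of 16; the real constraint may differ.
MULTIPLE = 16


def fit_resolution(aspect_w: int, aspect_h: int,
                   max_pixels: int = MAX_PIXELS,
                   multiple: int = MULTIPLE) -> tuple[int, int]:
    """Return the largest (width, height) for the aspect ratio within the pixel budget."""
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio components must be positive")
    ratio = aspect_w / aspect_h
    # Solve width * height = max_pixels with width = ratio * height.
    height = math.sqrt(max_pixels / ratio)
    width = height * ratio
    # Round each side down so the result never exceeds the budget.
    w = int(width // multiple) * multiple
    h = int(height // multiple) * multiple
    return w, h


if __name__ == "__main__":
    for aw, ah in [(1, 1), (16, 9), (3, 4)]:
        w, h = fit_resolution(aw, ah)
        print(f"{aw}:{ah} -> {w}x{h} ({w * h / 1e6:.2f} MP)")
```

Rounding down rather than to the nearest multiple keeps every result at or below the budget, at the cost of a slightly smaller image for some ratios.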

Key Capabilities

FLUX.2 introduces native multi-reference support: the model can take up to 10 reference images and keep characters, styles, or objects consistent across outputs without any fine-tuning. It also shows significant improvements in human anatomy, photorealistic lighting, and complex typography, so text placed in generated scenes comes out clear and legible.
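
One way to respect the 10-image limit in client code is to validate a request before it reaches the model. The sketch below is a hypothetical container, not part of any official SDK: the class name `GenerationRequest`, its fields, and the file paths are invented for illustration, and only the limit of 10 comes from the text above.

```python
from dataclasses import dataclass, field

# From the text: up to 10 reference images per generation.
MAX_REFERENCES = 10


@dataclass
class GenerationRequest:
    """Hypothetical request object; field names are illustrative only."""
    prompt: str
    reference_images: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if len(self.reference_images) > MAX_REFERENCES:
            raise ValueError(
                f"got {len(self.reference_images)} reference images, "
                f"but at most {MAX_REFERENCES} are supported"
            )


if __name__ == "__main__":
    request = GenerationRequest(
        prompt="The same character from the references, now reading in a sunlit library.",
        reference_images=["ref_front.png", "ref_profile.png"],  # placeholder paths
    )
    print(len(request.reference_images), "reference images accepted")
```

Failing early with a clear message is cheaper than discovering the limit after images have been loaded and encoded.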

Prompting and Optimization

The model is optimized for descriptive, prose-like prompts rather than bare keyword lists. For accurate text rendering, enclose the exact words to be rendered in quotation marks. With 32B parameters, the full weights require substantial VRAM: at 16-bit precision the transformer weights alone come to roughly 64 GB. The model is compatible with quantization formats such as FP8 and NVFP4, which reduce the memory footprint enough to make local execution on consumer-grade hardware feasible.
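
The effect of quantization on memory can be estimated with simple arithmetic: parameter count times bits per parameter. The sketch below covers only the 32B transformer weights and assumes 16, 8, and 4 bits per parameter for BF16, FP8, and NVFP4 respectively. It ignores the scaling factors that block-scaled formats store, as well as the text encoder, activations, and any other components, so real usage will be higher.

```python
# From the text: the model has 32 billion parameters.
NUM_PARAMS = 32_000_000_000

# Assumption: nominal bits per parameter, ignoring scale factors and metadata.
BITS_PER_PARAM = {"bf16": 16, "fp8": 8, "nvfp4": 4}


def weight_gigabytes(num_params: int, bits: int) -> float:
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_PARAM.items():
        print(f"{name}: ~{weight_gigabytes(NUM_PARAMS, bits):.0f} GB")
    # bf16: ~64 GB, fp8: ~32 GB, nvfp4: ~16 GB
```

Each halving of precision halves the weights-only estimate, which is why 4-bit formats bring the transformer within reach of high-end consumer GPUs.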

Rankings & Comparison