FLUX1.1 [pro] Ultra is a high-resolution text-to-image generation model developed by Black Forest Labs. Building on FLUX1.1 [pro], the "Ultra" update produces images at up to 4 megapixels (e.g., 2048×2048 pixels, about 4.2 million pixels) while maintaining high generation speed. It targets professional workflows that need both fine detail and fast turnaround, typically delivering a 4-megapixel image in approximately 10 seconds.
The model introduces two distinct operational modes: Ultra and Raw. Ultra mode is engineered for maximum composition precision and high-definition clarity, making it suitable for complex graphic design and commercial art. Raw mode is specifically tuned to capture the authentic feel of candid photography, prioritizing natural textures, reduced synthetic artifacts, and diverse human subjects to achieve a hyper-realistic aesthetic.
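As a rough illustration of how a client might choose between the two modes, the sketch below builds a JSON request body with a boolean switch for Raw mode. The field names `prompt`, `raw`, and `aspect_ratio` and the helper `build_payload` are assumptions made for this example and should be checked against the official API reference; no network request is made.

```python
import json

def build_payload(prompt, raw=False, aspect_ratio="16:9"):
    """Build a request body for a hypothetical image-generation call.

    All field names here are assumptions for illustration only.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "raw": bool(raw),              # False: default Ultra look; True: Raw mode
        "aspect_ratio": aspect_ratio,  # e.g. "16:9" or "21:9"
    }

def to_json(payload):
    """Serialize the payload with stable key order."""
    return json.dumps(payload, sort_keys=True)

# Same prompt, two looks: only the `raw` flag differs.
ultra_request = build_payload("street portrait at dusk")
raw_request = build_payload("street portrait at dusk", raw=True)
print(to_json(ultra_request))
print(to_json(raw_request))
```

Treating the mode as a single flag keeps the prompt and the other settings identical between the two requests, which makes side-by-side comparison of the two looks straightforward.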
Architecture and Performance
The model uses a hybrid architecture of multimodal and parallel diffusion transformer blocks, scaled to 12 billion parameters. It is trained with flow matching, a general method for training generative models that includes diffusion as a special case. Rotary positional embeddings and parallel attention layers let the model process high-resolution inputs efficiently and generate at the target resolution directly, avoiding the quality degradation common in upscaled outputs.
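To make the flow matching idea concrete, here is a minimal sketch of the training objective for one common choice of path: a straight line from a noise sample x0 to a data sample x1, whose velocity is x1 - x0. The source does not state which path or loss weighting this model uses, so the linear path is an assumption, and `predict_velocity` is a hypothetical stand-in for the network.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x1, rng):
    """One-sample estimate of the flow matching loss for data point x1.

    predict_velocity(x_t, t) is a hypothetical stand-in for the model.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise sample
    t = rng.random()                        # time drawn uniformly from [0, 1)
    x_t = interpolate(x0, x1, t)
    v_target = target_velocity(x0, x1)
    v_pred = predict_velocity(x_t, t)
    # Mean squared error between predicted and target velocity.
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(x1)

# Usage: a dummy model that always predicts zero velocity.
rng = random.Random(0)
zero_model = lambda x_t, t: [0.0 for _ in x_t]
print(flow_matching_loss(zero_model, [1.0, -2.0, 0.5], rng))
```

At sampling time the learned velocity field would be integrated from t = 0 to t = 1, starting from noise. The sketch covers only the training target.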
Prompting and Capabilities
FLUX1.1 [pro] Ultra adheres closely to complex natural-language prompts and supports a range of aspect ratios, including 16:9 and 21:9. Its design focuses on maintaining prompt fidelity at higher pixel counts, so that details specified in the text are rendered accurately across the entire image. The model also supports image-to-image workflows, in which an existing image serves as structural or stylistic context.
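The relationship between aspect ratio and a roughly 4-megapixel budget can be worked out with simple arithmetic. The sketch below solves width × height = budget for a given ratio and rounds each side down to a multiple of 32. The rounding step and the helper name `dims_for_aspect` are assumptions for illustration; the exact output sizes the model produces are not stated in the source.

```python
import math

def dims_for_aspect(ratio_w, ratio_h, pixel_budget=2048 * 2048, multiple=32):
    """Return (width, height) close to the pixel budget for an aspect ratio.

    `multiple` is an assumed alignment, not a documented constraint.
    """
    # Solve w * h = budget subject to w / h = ratio_w / ratio_h.
    height = math.sqrt(pixel_budget * ratio_h / ratio_w)
    width = height * ratio_w / ratio_h
    # Round both sides down so the result never exceeds the budget.
    width = int(width) // multiple * multiple
    height = int(height) // multiple * multiple
    return width, height

print(dims_for_aspect(1, 1))    # (2048, 2048)
print(dims_for_aspect(16, 9))   # (2720, 1536)
print(dims_for_aspect(21, 9))   # (3104, 1312)
```

Because each side is rounded independently, the resulting ratio is only approximate for wide formats such as 21:9.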