FIBO Edit is an 8-billion-parameter image-to-image foundation model developed by Bria AI, designed for high-precision, instruction-based editing. It is a specialized variant within the FIBO (Foundation for Intelligent Business Operations) family and introduces a JSON-native architecture built on the Visual GenAI Language (VGL). This paradigm replaces ambiguous natural-language prompts with structured schemas, with the aim of producing deterministic results and preventing the "prompt drift" common in traditional diffusion models.
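To illustrate the difference between a free-form prompt and a structured one, the following minimal sketch represents a scene as nested JSON fields and serializes it canonically. The field names (`subject`, `lighting`, `camera`) and the `canonical` helper are hypothetical and chosen for illustration only; they are not the published VGL schema or any Bria API.

```python
import json

# Hypothetical structured prompt. The field names are illustrative
# assumptions, not the actual VGL schema.
structured_prompt = {
    "subject": {"description": "ceramic coffee mug", "position": "center"},
    "lighting": {"type": "soft window light", "direction": "left"},
    "camera": {"angle": "eye level", "focal_length_mm": 50},
}

def canonical(prompt: dict) -> str:
    """Serialize with sorted keys so equal prompts yield identical strings."""
    return json.dumps(prompt, sort_keys=True, separators=(",", ":"))

# The same content written in a different key order.
reordered = {
    "camera": {"focal_length_mm": 50, "angle": "eye level"},
    "lighting": {"direction": "left", "type": "soft window light"},
    "subject": {"position": "center", "description": "ceramic coffee mug"},
}

# Both spellings collapse to one canonical string.
same_request = canonical(structured_prompt) == canonical(reordered)
print(same_request)
```

Because every attribute sits in a named field, two requests can be compared field by field, which is not possible with paraphrased natural-language prompts.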
The model is based on a Diffusion Transformer (DiT) architecture trained with flow matching. It integrates a SmolLM3-3B text encoder with a specialized DimFusion conditioning layer to manage complex, long-form descriptions. By disentangling visual elements such as lighting, camera angles, and object placement, FIBO Edit allows users to modify specific attributes of an image while preserving the structural integrity and aesthetic consistency of the original scene.
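At the level of the instruction itself, a disentangled edit amounts to changing one field and leaving every other field untouched. The sketch below shows this on a nested dictionary; the `scene` layout and the helpers `set_attribute` and `changed_paths` are hypothetical names introduced for illustration, and the code says nothing about how the model handles the edit internally.

```python
import copy

def set_attribute(prompt: dict, path: tuple, value) -> dict:
    """Return a copy of `prompt` with the value at `path` replaced."""
    updated = copy.deepcopy(prompt)
    node = updated
    for key in path[:-1]:
        node = node[key]
    node[path[-1]] = value
    return updated

def changed_paths(before: dict, after: dict, prefix: tuple = ()) -> list:
    """List the leaf paths whose values differ between two nested dicts."""
    paths = []
    for key in sorted(set(before) | set(after)):
        a, b = before.get(key), after.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            paths.extend(changed_paths(a, b, prefix + (key,)))
        elif a != b:
            paths.append(prefix + (key,))
    return paths

# Hypothetical scene description; field names are assumptions.
scene = {
    "lighting": {"type": "soft window light", "direction": "left"},
    "camera": {"angle": "eye level"},
    "objects": {"mug": {"position": "center"}},
}

relit = set_attribute(scene, ("lighting", "type"), "golden hour")
print(changed_paths(scene, relit))  # prints [('lighting', 'type')]
```

The diff confirms that only the lighting type changed, which is the property an attribute-level edit request is meant to express.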
Key Capabilities and Training
FIBO Edit supports native masking for localized modifications, enabling pixel-level edits suited to professional production environments. The model operates in three primary modes: Generate, which expands short ideas into structured JSON instructions; Refine, which makes iterative adjustments to existing images without unintended cascading changes; and Inspire, which extracts detailed prompts from reference images to facilitate stylistic variations.
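The idea behind a masked, localized edit can be shown with a generic compositing step: pixels inside the mask come from the edited image and pixels outside it are copied unchanged from the original. This is a minimal, library-free sketch of what a binary mask expresses, using a hypothetical `composite` helper on small grayscale grids. It is not a description of how FIBO Edit implements masking.

```python
def composite(original, edited, mask):
    """Take edited pixels where mask is 1 and original pixels where it is 0."""
    return [
        [e if m else o for o, e, m in zip(o_row, e_row, m_row)]
        for o_row, e_row, m_row in zip(original, edited, mask)
    ]

# Tiny 2x3 grayscale "images" as nested lists of pixel values.
original = [[10, 10, 10], [10, 10, 10]]
edited = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 0], [0, 1, 1]]  # 1 marks the region allowed to change

result = composite(original, edited, mask)
print(result)  # prints [[10, 99, 10], [10, 99, 99]]
```

Every pixel where the mask is 0 is identical to the original, which is the guarantee a localized edit is expected to provide.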
In alignment with enterprise standards for legal safety, FIBO Edit was trained exclusively on licensed and rights-cleared content. The training data uses structured JSON captions of up to 1,000 words, each encoding over 100 visual attributes. This approach provides granular control over the final output and supports compliance with intellectual property requirements in commercial deployment.
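To make the caption figures concrete, the sketch below counts the leaf attributes and the words in a small structured caption and checks the word total against the 1,000-word ceiling stated above. The caption content, the helper names, and the counting rules (one attribute per leaf value, words split on whitespace) are assumptions for illustration; the source does not specify how Bria AI counts either quantity.

```python
MAX_WORDS = 1000  # upper bound on caption length stated in the text

def leaf_count(node) -> int:
    """Count leaf values, treating each one as a single visual attribute."""
    if isinstance(node, dict):
        return sum(leaf_count(v) for v in node.values())
    if isinstance(node, list):
        return sum(leaf_count(v) for v in node)
    return 1

def word_count(node) -> int:
    """Count whitespace-separated words across all string leaves."""
    if isinstance(node, dict):
        return sum(word_count(v) for v in node.values())
    if isinstance(node, list):
        return sum(word_count(v) for v in node)
    if isinstance(node, str):
        return len(node.split())
    return 0

# Hypothetical caption; real captions would be far longer.
caption = {
    "subject": "a red bicycle",
    "lighting": {"type": "overcast", "direction": "top"},
    "objects": ["brick wall", "bike rack"],
}

stats = {"attributes": leaf_count(caption), "words": word_count(caption)}
within_limit = stats["words"] <= MAX_WORDS
print(stats, within_limit)  # prints {'attributes': 5, 'words': 9} True
```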