GPT-5.4 nano is a lightweight, cost-optimized language model developed by OpenAI and released on March 17, 2026. It is designed for high-volume, low-latency applications where speed and efficiency are prioritized over deep, multi-step reasoning. As the smallest variant in the GPT-5.4 family, it is primarily intended for 'sub-agent' roles within larger AI systems, handling tasks such as classification, data extraction, and structured output generation.
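The sub-agent classification role described above can be sketched as follows. The `call_nano` helper is hypothetical and is stubbed with a keyword heuristic so the example runs offline; in a real system it would wrap an API call to the model.

```python
# Sketch of a sub-agent classification step. call_nano() is a
# hypothetical stand-in for an actual GPT-5.4 nano API call,
# stubbed here with a keyword heuristic so the example runs offline.

LABELS = ["billing", "technical", "other"]

def call_nano(prompt: str) -> str:
    """Stub for a real model call; returns one label string."""
    text = prompt.lower()
    if "invoice" in text or "charge" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "other"

def classify_ticket(ticket: str) -> str:
    prompt = f"Classify this support ticket as one of {LABELS}: {ticket}"
    label = call_nano(prompt)
    # Constrain output to the allowed label set, since an orchestrating
    # system expects structured, validated results from a sub-agent.
    return label if label in LABELS else "other"

print(classify_ticket("I was double charged on my last invoice"))  # billing
print(classify_ticket("The app crashes on startup"))               # technical
```

The validation step at the end reflects why small models are paired with structured output: the caller can reject or retry anything outside the expected label set.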
The model features a 400,000-token context window and a maximum output limit of 128,000 tokens, supporting both text and image inputs. Despite its compact architecture, it remains technically capable, scoring 52.4% on the SWE-Bench Pro software engineering benchmark. With a knowledge cutoff of August 2025, it maintains strong instruction-following performance and reliability in tool-calling scenarios.
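The stated limits imply a simple budgeting rule for callers. A minimal sketch, assuming (as is common) that input tokens and reserved output tokens share the context window; the function and constant names are illustrative, not part of any official SDK.

```python
# Token-budget check for the limits stated above: a 400,000-token
# context window shared between the prompt and a reserved output
# budget, with output capped at 128,000 tokens. The sharing assumption
# is ours; the exact accounting depends on the API.

CONTEXT_WINDOW = 400_000
MAX_OUTPUT = 128_000

def fits_in_context(prompt_tokens: int, reserved_output: int) -> bool:
    """Return True if the request fits within the model's limits."""
    if reserved_output > MAX_OUTPUT:
        raise ValueError("reserved output exceeds the 128k output cap")
    return prompt_tokens + reserved_output <= CONTEXT_WINDOW

# A 300k-token prompt leaves room for at most 100k output tokens.
print(fits_in_context(300_000, 100_000))  # True
print(fits_in_context(300_000, 128_000))  # False
```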
Optimized for high throughput, GPT-5.4 nano can reach speeds of approximately 200 tokens per second. Its pricing structure is tailored for massive-scale API deployment, making it viable for background processing and real-time interactive systems where per-token cost must stay minimal. It is often used in tandem with larger models like GPT-5.4 Pro, executing focused sub-tasks delegated by the more powerful reasoning-focused variants.
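The ~200 tokens-per-second figure translates directly into rough generation-time estimates. A back-of-the-envelope sketch covering decode time only; network round-trip and prompt-processing latency are ignored, which is a simplifying assumption.

```python
# Generation-time estimate from the ~200 tok/s throughput figure
# cited above. Decode time only: prefill and network latency are
# deliberately ignored (a simplifying assumption).

THROUGHPUT_TOKS_PER_S = 200.0

def estimated_seconds(output_tokens: int) -> float:
    """Approximate wall-clock seconds to generate output_tokens."""
    return output_tokens / THROUGHPUT_TOKS_PER_S

print(estimated_seconds(1_000))   # 5.0
print(estimated_seconds(10_000))  # 50.0
```

At this rate, even a full 128,000-token output would complete in roughly ten to eleven minutes, which is why the model suits background batch work as well as short interactive turns.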