Google

Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning)

Released Sep 2025

Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) is an optimized variant of Google's lightweight language model family, designed to balance low latency with stronger reasoning. Released in September 2025 as an update to the Gemini 2.5 line, this version improves instruction following and offers a "Thinking" mode that lets developers adjust the model's reasoning budget for complex tasks. It is intended for high-throughput enterprise applications that require both cost efficiency and logical reliability.

The model supports a 1-million-token context window and is natively multimodal, processing text, audio, images, and video. The September update specifically targeted output verbosity, cutting output token usage by roughly 50% compared to previous iterations, which lowers both latency and operating cost. It remains compatible with Google's native tools, including code execution and grounding through Google Search.

In reasoning-enabled configurations, the model generates an internal chain of thought to improve accuracy in mathematics, programming, and multi-step agentic workflows. This "thinking" process is transparent and controllable via API, providing a middle ground between standard fast-response models and larger, compute-intensive reasoning models.
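
As a rough illustration of what "controllable via API" can look like, the sketch below builds a request body for the Gemini REST generateContent endpoint with a thinking budget attached. It sends nothing over the network and only assembles the URL and JSON payload. The model identifier, the helper name build_request, and the budget value are assumptions made for this example, not details taken from this page; check Google's current API documentation before relying on them.

```python
import json

# Assumed model identifier for this preview; verify against the live model list.
MODEL_ID = "gemini-2.5-flash-lite-preview-09-2025"
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt, thinking_budget, include_thoughts=False):
    """Return (url, body) for a generateContent call with a thinking budget.

    build_request is a hypothetical helper written for this sketch.
    thinking_budget is the number of tokens the model may spend on its
    internal reasoning; include_thoughts asks for thought summaries in
    the response.
    """
    url = f"{API_ROOT}/models/{MODEL_ID}:generateContent"
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {
                "thinkingBudget": thinking_budget,
                "includeThoughts": include_thoughts,
            }
        },
    }
    return url, body


if __name__ == "__main__":
    # Illustrative values only: a small budget for a short multi-step question.
    url, body = build_request("What is 17 * 23? Show the steps.", 512, True)
    print(url)
    print(json.dumps(body, indent=2))
```

A larger budget trades latency and cost for more reasoning, so a caller would typically raise it only for math, code, or multi-step agent tasks and keep it low for simple lookups.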

Rankings & Comparison