Google

Gemini 2.0 Flash-Lite (Feb '25)

Released Feb 2025

Gemini 2.0 Flash-Lite is a multimodal large language model developed by Google and optimized for efficiency and low latency. Introduced as a cost-effective variant within the Gemini 2.0 family, it is designed for high-volume, frequently repeated tasks at scale, balancing response speed against output quality. The model has a 1 million token context window, which lets it process and reason over large inputs that combine text, images, and audio. According to Google, Gemini 2.0 Flash-Lite delivers better quality than its predecessor, Gemini 1.5 Flash, at a comparable price and speed. It is aimed at developers who need high throughput in latency-sensitive applications such as rapid summarization and content generation. Its reported knowledge cutoff is June 2024.

Rankings & Comparison