Google
Open Weights

Gemma 3n E4B Instruct Preview (May '25)

Released May 2025

Intelligence: #386
Context: 32K
Parameters: 4B (effective)

The Gemma 3n E4B Instruct Preview is a lightweight, multimodal model released by Google in May 2025 as part of its mobile-first AI family. Optimized for execution on low-resource hardware such as smartphones and laptops, this version is designed to provide high-performance generative capabilities on-device. It serves as an early iteration of the Gemma 3n series, specifically focusing on text and vision input processing through the LiteRT runtime.

Technically, the model features a Matryoshka Transformer (MatFormer) architecture, which allows for nested sub-models and selective parameter activation. The "E4B" designation refers to its 4 billion effective parameters, though the architecture maintains a raw count of approximately 8 billion. This design is paired with Per-Layer Embedding (PLE) caching, a technique that offloads specific embedding layers to the CPU to minimize the VRAM footprint, allowing the model to operate efficiently within a memory budget of approximately 3GB.
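The two techniques above can be illustrated with a toy NumPy sketch. This is a conceptual illustration only, with made-up dimensions and plain arrays standing in for real device memory; it does not reflect Gemma 3n's actual layer sizes or the LiteRT runtime. The MatFormer part runs a feed-forward layer at two nested widths that share one weight matrix; the PLE part keeps a large embedding table host-resident and gathers only the rows needed per step.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- MatFormer-style nested FFN (illustrative sizes, not Gemma 3n's) ---
d_model, d_ff_full, d_ff_small = 8, 32, 16
W_in = rng.standard_normal((d_model, d_ff_full))
W_out = rng.standard_normal((d_ff_full, d_model))

def ffn(x, width):
    # Activate only the first `width` hidden units: the smaller sub-model
    # is a prefix slice of the full model's weights, not a separate copy.
    h = np.maximum(x @ W_in[:, :width], 0.0)  # ReLU for simplicity
    return h @ W_out[:width, :]

x = rng.standard_normal(d_model)
y_full = ffn(x, d_ff_full)    # full-width path
y_small = ffn(x, d_ff_small)  # nested "effective" path, same weights

# --- PLE-style host-resident embedding table ---
vocab, d_emb = 1000, 8
host_table = rng.standard_normal((vocab, d_emb))  # stays in CPU RAM

def gather(token_ids):
    # Only the rows for the current tokens move toward the accelerator,
    # so the full table never has to occupy VRAM.
    return host_table[np.asarray(token_ids)]

rows = gather([3, 17, 42])
```

The point of the sketch is that both mechanisms trade memory for indexing: the nested FFN reuses a prefix of one parameter set, and the embedding gather pays a per-step copy to keep the bulk of the table off the accelerator.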

At the time of its preview release, the model supported multimodal tasks including image analysis, visual question answering, and text summarization. It was trained on roughly 11 trillion tokens spanning multilingual text, code, and mathematics, with a knowledge cutoff of June 2024. The model is released under an open-weights license, supporting responsible development and research in on-device AI applications.

Rankings & Comparison