
LFM2-8B-A1B

Released Oct 2025

LFM2-8B-A1B is a sparse Mixture-of-Experts (MoE) language model developed by Liquid AI for efficient on-device and edge AI deployment. As part of the Liquid Foundation Model 2 (LFM2) series, it is designed to balance high output quality with low-latency inference. The model contains 8.3 billion total parameters but activates only 1.5 billion per token along a sparse execution path, so it runs with roughly the compute cost of a much smaller dense model while targeting the quality typically associated with 3-4B parameter dense models.
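Back-of-the-envelope arithmetic makes the sparsity trade-off concrete; the sketch below uses only the parameter counts quoted above (8.3B total, 1.5B active per token):

```python
# Sparsity arithmetic from the figures stated above (not an official calculation).
TOTAL_PARAMS = 8.3e9    # total parameters stored in the model
ACTIVE_PARAMS = 1.5e9   # parameters activated for each token

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active fraction per token: {active_fraction:.1%}")  # prints "Active fraction per token: 18.1%"
```

In other words, each token touches less than a fifth of the weights, which is why per-token compute resembles that of a small dense model even though all 8.3B parameters must still fit in memory.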

Architecture and Design

The model's architecture is built on a hybrid backbone that integrates 18 gated short-convolution blocks and 6 grouped-query attention (GQA) blocks. This design is optimized through hardware-in-the-loop architecture search to maximize efficiency on consumer CPUs and NPUs. Unlike standard Transformer architectures, the LFM2 backbone uses convolutions to handle local patterns, which reduces memory and compute costs. The MoE implementation uses 32 experts per layer (excluding the first two layers, which remain dense for stability) and selects the top-4 experts for each token using a normalized-sigmoid gating mechanism.
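The routing step described above (32 experts per layer, top-4 selection, normalized-sigmoid gating) can be sketched as follows. This is a minimal illustration of the general technique, not Liquid AI's implementation; the function name and shapes are assumptions for the example.

```python
import numpy as np

def route_token(router_logits: np.ndarray, top_k: int = 4):
    """Normalized-sigmoid gating sketch: apply a sigmoid to each expert's
    router score, keep the top_k experts, and renormalize their gates
    so the selected mixing weights sum to 1."""
    gates = 1.0 / (1.0 + np.exp(-router_logits))   # independent sigmoid per expert
    top_idx = np.argsort(gates)[-top_k:][::-1]     # top_k expert indices, best first
    top_gates = gates[top_idx]
    top_gates = top_gates / top_gates.sum()        # normalize the selected gates
    return top_idx, top_gates

rng = np.random.default_rng(0)
logits = rng.normal(size=32)            # one router score per expert (32 experts/layer)
experts, weights = route_token(logits)  # 4 expert ids and their mixing weights
```

The token's output is then a weighted sum of the four selected experts' outputs, so only 4 of the 32 expert FFNs run per token, which is where the sparse compute savings come from.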

Capabilities and Training

LFM2-8B-A1B was pre-trained on a dataset of 12 trillion tokens, including English, code, and multilingual data. It supports a context window of 32,768 tokens and is capable of processing eight major languages, including English, Chinese, French, and Spanish. The model is particularly suited for agentic tasks, data extraction, and multi-turn conversations, though it is primarily optimized for low-latency assistant applications on local hardware. It is released under the LFM Open License v1.0, supporting academic research and commercial use for companies under a specific revenue threshold.
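For multi-turn assistant use, conversation history has to stay within the 32,768-token context window. A minimal sketch of one common strategy, dropping the oldest turns first, is shown below; `count_tokens` is a crude word-count stand-in, and in practice you would use the model's actual tokenizer.

```python
# Sketch: keep only the most recent conversation turns that fit inside the
# model's 32,768-token context window. Assumes turns are plain strings and
# uses a word-count placeholder instead of a real tokenizer.
CONTEXT_WINDOW = 32_768

def count_tokens(text: str) -> int:
    return len(text.split())  # placeholder; substitute the model's tokenizer

def trim_history(turns: list[str], budget: int = CONTEXT_WINDOW) -> list[str]:
    kept, used = [], 0
    for turn in reversed(turns):       # walk from the newest turn backwards
        cost = count_tokens(turn)
        if used + cost > budget:
            break                      # oldest turns beyond the budget are dropped
        kept.append(turn)
        used += cost
    return list(reversed(kept))        # restore chronological order
```

For example, `trim_history(["a b", "c d e", "f"], budget=4)` keeps the two newest turns and returns `["c d e", "f"]`.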
