Ministral 8B is a compact large language model (LLM) released by Mistral AI in October 2024. Part of the les Ministraux family, it is specifically optimized for edge computing, on-device deployment, and low-latency applications. It was designed to provide advanced reasoning and knowledge-retrieval capabilities within a sub-10-billion-parameter footprint, serving as a high-performance successor to the original Mistral 7B.
The model features a 128,000-token context window, allowing it to process large documents and sustain extended multi-turn conversations. Its architecture uses an interleaved sliding-window attention mechanism intended to improve inference speed and memory efficiency. This makes the model particularly well suited to local analytics, offline smart assistants, and autonomous robotics, where privacy and immediate response times are critical.
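The efficiency gain of sliding-window attention comes from each token attending only to a fixed-size window of recent tokens rather than the full sequence. A minimal NumPy sketch of a causal sliding-window mask illustrates the idea; the sequence length and window size here are illustrative, not the model's actual configuration:

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal sliding-window mask: token i may attend to token j only if
    j <= i (causality) and j > i - window (recency)."""
    i = np.arange(seq_len)[:, None]  # query positions, as a column
    j = np.arange(seq_len)[None, :]  # key positions, as a row
    return (j <= i) & (j > i - window)

# Each token sees at most the last 3 tokens (including itself),
# so attention cost per token is O(window) instead of O(seq_len).
mask = sliding_window_mask(6, 3)
```

With a fixed window, per-token attention cost and key-value cache size stay constant as the context grows, which is what makes long contexts tractable on memory-constrained edge devices.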
Ministral 8B is trained on data with a significant proportion of multilingual and code content. It supports native function calling, enabling it to act as an intermediary in agentic workflows by routing tasks, parsing structured data, and calling external APIs. At release, Mistral AI reported that the model outperformed several contemporary models of similar size, including Llama 3.1 8B and Gemma 2 9B, on benchmarks such as MMLU and GSM8K.
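In a function-calling workflow, the application describes available tools to the model, the model emits a structured call, and the application executes it and returns the result. The sketch below shows the application side of that loop; the tool name, schema layout, and `dispatch` helper are hypothetical illustrations in the common JSON style, not the exact format of any particular API:

```python
import json

# Hypothetical tool schema, in the common JSON-schema style used by
# function-calling APIs; the exact format a given endpoint expects may differ.
tools = [{
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def dispatch(call_json: str) -> str:
    """Route a model-emitted tool call (as JSON) to a local handler."""
    call = json.loads(call_json)
    if call["name"] == "get_weather":
        # Placeholder handler; a real one would query a weather service.
        return json.dumps({"city": call["arguments"]["city"], "temp_c": 21})
    raise ValueError(f"unknown tool: {call['name']}")

# Simulated model output: a structured call rather than free-form text.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
```

The result string is then fed back to the model, which composes the final user-facing answer; this request-execute-respond loop is what lets a small on-device model orchestrate external systems.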