Mistral
Open Weights

Mixtral 8x22B Instruct

Released Apr 2024

Intelligence: #394
Context: 65K tokens
Parameters: 141B

Mixtral 8x22B Instruct is a large-scale sparse Mixture-of-Experts (SMoE) model developed by Mistral AI. It is an instruction-tuned version of the base Mixtral 8x22B model, designed for high-performance reasoning, multilingual tasks, and code generation.
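
Because the weights are openly released, the model can be run with standard open-source tooling. Below is a minimal sketch, assuming the Hugging Face repository id mistralai/Mixtral-8x22B-Instruct-v0.1, the transformers library, and hardware with enough GPU memory to shard a ~141B-parameter model.

```python
# Minimal sketch: loading and querying Mixtral 8x22B Instruct via transformers.
# The repository id and hardware setup are assumptions; serving a 141B-parameter
# model in practice requires multiple GPUs (device_map="auto" shards the layers).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-Instruct-v0.1"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut the memory footprint
    device_map="auto",           # spread layers across the available GPUs
)

messages = [{"role": "user", "content": "Summarise mixture-of-experts in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```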

Architecture and Performance

The model has 141 billion total parameters, of which roughly 39 billion are active during inference: the router engages only a small subset of experts per layer for each token, so compute cost tracks the active parameters rather than the full parameter count while the model still draws on its large total capacity. It supports a context window of 64K (65,536) tokens, enabling it to process long documents and complex, multi-turn dialogues.
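
The snippet below is an illustrative top-2 routing layer, not Mistral's implementation; the hidden sizes are placeholders, and the 8-expert, top-2 configuration is assumed here to mirror the published design. It shows why per-token compute depends on the active parameters rather than the total.

```python
# Illustrative sparse Mixture-of-Experts layer with top-2 routing.
# Not Mistral's code; dimensions are placeholders. The point is that only the
# selected experts' weights are used for each token, which is why active
# parameters (~39B) are far fewer than total parameters (141B).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                     # x: (tokens, d_model)
        scores = self.gate(x)                 # router logits: (tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = Top2MoE()
tokens = torch.randn(4, 512)
print(moe(tokens).shape)  # torch.Size([4, 512])
```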

Capabilities

The model is optimized for several European languages, including English, French, Italian, German, and Spanish, and performs strongly in technical domains such as mathematics and programming. The Instruct version is post-trained with supervised fine-tuning and direct preference optimization (DPO) so that it follows complex user instructions precisely while keeping a neutral conversational tone.
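
As a small usage sketch of the instruction format, the multilingual conversation below is rendered with the tokenizer's built-in chat template, which wraps user turns in the [INST] ... [/INST] markers the Instruct model was trained on. The repository id is assumed as above, and the exact rendered string depends on the template shipped with the tokenizer.

```python
# Sketch: formatting a multilingual, multi-turn conversation with the chat
# template stored in the tokenizer config (repository id assumed). The template
# produces the [INST] ... [/INST] instruction format the model expects.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "Résume l'architecture Mixtral en une phrase."},
    {"role": "assistant", "content": "Mixtral route chaque token vers un petit sous-ensemble d'experts."},
    {"role": "user", "content": "Now repeat that summary in German."},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)  # ready to be tokenized and passed to the model for generation
```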

Rankings & Comparison