Mixtral 8x22B Instruct is a sparse Mixture-of-Experts (MoE) language model developed by Mistral AI, optimized for instruction following and complex reasoning tasks. A larger successor to the Mixtral 8x7B architecture, it significantly expands the parameter count while maintaining computational efficiency through sparse activation. It is released under the Apache 2.0 license.

## Architecture and Performance
The model has 141 billion total parameters, of which approximately 39 billion are active per token during inference: a router selects a small subset of experts for each token, so only those experts' weights participate in the computation. This sparse structure lets the model achieve high performance while requiring far less compute per token than a dense model of equivalent total size (all 141 billion parameters must still be held in memory, however). It supports a context window of 65,536 tokens, facilitating the processing of extensive documents and long-range dependencies in text.

## Capabilities
The instruction-tuned version is optimized for advanced reasoning, mathematics, and programming. It provides native support for multiple languages, specifically English, French, Italian, German, and Spanish. The model was refined using supervised fine-tuning and preference optimization techniques, and it includes native support for function calling and constrained output formats.
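The sparse activation described above can be illustrated with a toy top-2 routing layer. This is a minimal sketch, not Mixtral's implementation: the expert count (8) and active experts per token (2) mirror the Mixtral family, but the hidden size, random weights, and single-matrix "experts" are purely illustrative (real experts are full MLP blocks).

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # experts per MoE layer in the Mixtral family
TOP_K = 2         # only 2 experts are activated per token
D_MODEL = 16      # toy hidden size, for illustration only

# Router: a linear layer producing one logit per expert.
router_w = rng.normal(size=(D_MODEL, NUM_EXPERTS))

# Toy experts: a single linear map each (real experts are MLPs).
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-2 experts and mix their outputs."""
    logits = x @ router_w                           # (tokens, experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]   # top-2 expert indices
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = top[t]
        # Softmax over the selected logits gives the mixing weights.
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        for weight, e in zip(w, sel):
            out[t] += weight * (x[t] @ experts[e])
    return out

tokens = rng.normal(size=(4, D_MODEL))  # 4 toy tokens
y = moe_layer(tokens)
print(y.shape)  # (4, 16): each token only touched 2 of the 8 experts
```

This is why active parameters (~39B) rather than total parameters (141B) determine the per-token compute cost: the weights of the six unselected experts are never multiplied.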
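The function-calling support mentioned above generally follows a common pattern: the model emits a structured (constrained) JSON description of a tool call, and the client parses and dispatches it. The sketch below shows that client-side loop only; the JSON shape is illustrative rather than Mistral's exact wire format, and `get_weather` is a hypothetical tool defined solely for this example.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool the model may request; returns canned data here."""
    return f"Sunny in {city}"

# Registry mapping tool names the model may emit to local functions.
TOOLS = {"get_weather": get_weather}

# Pretend the model produced this structured tool call; constrained
# output formats guarantee it parses as valid JSON.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # Sunny in Paris
```

In a real application the `result` string would be appended to the conversation and sent back to the model, which then composes its final natural-language answer.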