Aya Expanse 8B is a multilingual large language model developed by Cohere For AI as part of the Aya initiative, a global research collaboration aimed at expanding the reach of generative AI. Built on the Command family of models, it is an open-weight release with 8 billion parameters, optimized for 23 languages: Arabic, Chinese, Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese. The model is designed to narrow the performance gap between the languages best served by mainstream AI models and those that are typically underserved.

The development of Aya Expanse 8B drew on several research techniques. Data arbitrage is a synthetic-data strategy that draws training examples from a pool of teacher models, choosing the best source for each language or prompt instead of relying on a single teacher. Multilingual preference training (RLHF/DPO) aligns the model with human preferences across its supported languages rather than in English alone. Model merging combines the weights of separately trained checkpoints into a single model, which helps preserve performance in both high-resource and low-resource languages. Together, these techniques allow the 8B-parameter model to achieve competitive benchmark results against other models of comparable size. It supports a range of natural language processing tasks, such as text generation, summarization, and translation, while adhering to safety guidelines across its supported languages.
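To make the idea of model merging more concrete, the following is a minimal sketch of one common merging approach, weighted linear averaging of parameters. The text above does not say which merging method was used for Aya Expanse 8B, so this is an illustrative assumption, not a description of the actual training pipeline. The function name `merge_linear`, the `Params` type, and the toy checkpoints are all hypothetical; real checkpoints hold tensors rather than short lists of floats.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def merge_linear(checkpoints: Sequence[Params], weights: Sequence[float]) -> Params:
    """Merge checkpoints by a weighted average of each parameter (illustrative only)."""
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")
    if len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Normalize so the merged parameters stay on the same scale as the inputs.
    norm = [w / total for w in weights]

    # All checkpoints must share the same parameter names and shapes.
    names = checkpoints[0].keys()
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != names:
            raise ValueError("checkpoints have different parameter names")

    merged: Params = {}
    for name in names:
        length = len(checkpoints[0][name])
        if any(len(ckpt[name]) != length for ckpt in checkpoints):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(norm, checkpoints))
            for i in range(length)
        ]
    return merged


# Toy example: two hypothetical language-specialized checkpoints, merged equally.
checkpoint_a = {"layer.weight": [1.0, 2.0]}
checkpoint_b = {"layer.weight": [3.0, 6.0]}
merged = merge_linear([checkpoint_a, checkpoint_b], weights=[1.0, 1.0])
print(merged)  # {'layer.weight': [2.0, 4.0]}
```

Unequal weights would bias the merged model toward one checkpoint, which is one way such a scheme could trade off between languages or capabilities.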