Aya Expanse 32B is a multilingual large language model developed by Cohere For AI (C4AI). It is part of the Aya initiative, a research project focused on expanding high-performance AI capabilities to a global audience. The model is specifically optimized for 23 languages, including Arabic, Chinese, French, Hindi, and Japanese, aiming to provide non-English language support that rivals the performance of monolingual models.
The model architecture is an auto-regressive transformer with 32 billion parameters, built upon the foundation of Cohere's Command R series. Its training process incorporates several research contributions, such as data arbitrage, a strategic sampling approach that draws synthetic training data from a pool of teacher models to balance quality across languages, and multilingual preference training. Additionally, model merging techniques were used to combine the strengths of different fine-tuned checkpoints into a single robust model.
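To illustrate the model-merging idea in its simplest form, the sketch below averages the parameters of several fine-tuned checkpoints into one set of weights. This is a generic linear-merging toy example with hypothetical parameter names and values; it is not Aya Expanse's actual merging recipe, which may use more sophisticated weighting schemes.

```python
def merge_checkpoints(checkpoints, weights=None):
    """Merge parameter dicts by (optionally weighted) averaging.

    checkpoints: list of dicts mapping parameter name -> list of floats.
    weights: optional per-checkpoint mixing coefficients summing to 1;
             defaults to a uniform average.
    """
    if weights is None:
        weights = [1.0 / len(checkpoints)] * len(checkpoints)
    merged = {}
    for name in checkpoints[0]:
        params = [ckpt[name] for ckpt in checkpoints]
        # Element-wise weighted sum across checkpoints.
        merged[name] = [
            sum(w * p[i] for w, p in zip(weights, params))
            for i in range(len(params[0]))
        ]
    return merged

# Two toy "checkpoints", e.g. fine-tuned on different data mixes.
ckpt_a = {"layer.0.weight": [1.0, 2.0], "layer.0.bias": [0.0, 0.0]}
ckpt_b = {"layer.0.weight": [3.0, 4.0], "layer.0.bias": [1.0, 1.0]}

merged = merge_checkpoints([ckpt_a, ckpt_b])
print(merged["layer.0.weight"])  # [2.0, 3.0]
print(merged["layer.0.bias"])    # [0.5, 0.5]
```

The design intuition is that checkpoints specialized on different objectives or language mixes occupy nearby points in parameter space, so interpolating between them can retain strengths from each without retraining.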
Aya Expanse 32B features a context window of 128,000 tokens, making it suitable for processing long documents and complex multilingual dialogues. It was released as an open-weight model under a CC-BY-NC license, intended to support the international research community in developing more inclusive language technology.