olmo-2-0325-32b-instruct

Developer: Allen AI
License: Open Weights
Released: Mar 2025
Arena AI rank: #194
Parameters: 32B
OLMo 2 32B Instruct is a 32-billion parameter language model developed by the Allen Institute for AI (AI2). Released as the largest and most capable member of the OLMo 2 family, it is a fully open-source model, providing public access to its weights, training code, and the underlying data. The model is designed to support the scientific study of language models by offering a transparent development pipeline.

The model's development followed a multi-stage training process, including pre-training on 6 trillion tokens and a refined post-training phase. It utilizes the Tülu 3.1 recipe, which incorporates supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR). This RLVR stage specifically employs Group Relative Policy Optimization (GRPO) to enhance reasoning and instruction-following capabilities on benchmarks such as GSM8K, MATH, and IFEval.
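The core idea behind GRPO is that, instead of training a separate value model, the advantage of each sampled response is computed relative to the other responses in its own group: sample several completions per prompt, score each with the (verifiable) reward, and normalize by the group's mean and standard deviation. A minimal illustrative sketch of that advantage computation is below; this is not AI2's implementation, and the binary reward values shown are a hypothetical example of the match/no-match signal RLVR-style checking would produce on a task like GSM8K.

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as in GRPO: normalize each sampled
    response's reward against the mean and std of its own group,
    removing the need for a learned value (critic) model."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Hypothetical verifiable rewards for 4 completions of one math prompt:
# 1.0 if the extracted answer matched the reference, else 0.0.
adv = grpo_advantages([1.0, 0.0, 1.0, 0.0])
```

Responses with above-average reward get positive advantages and are reinforced; below-average ones are pushed down, so the policy gradient only needs relative quality within each group rather than an absolute value estimate.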

OLMo 2 32B Instruct supports a context window of 128,000 tokens and is optimized for computational efficiency. According to AI2, the model matches or exceeds the performance of proprietary models like GPT-3.5 Turbo and GPT-4o mini across various academic benchmarks. It also demonstrates competitive performance against significantly larger open-weight models, such as Llama 3.1 70B and Qwen 2.5 72B, while requiring substantially less training compute.
