Allen AI
Open Weights

OLMo 2 32B

Released Mar 2025

OLMo 2 32B is a 32-billion-parameter large language model developed by the Allen Institute for AI (Ai2). As the largest iteration in the OLMo 2 family, it is designed with a focus on radical transparency, providing full access to training data, code, weights, and intermediate checkpoints. The model follows a Transformer-based autoregressive architecture and was trained on approximately 6 trillion tokens.
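Because the weights, tokenizer, and checkpoints are openly released, the model can be loaded with standard tooling. The following is a minimal sketch using Hugging Face Transformers; the repository ID allenai/OLMo-2-0325-32B and the hardware settings are assumptions that should be checked against Ai2's official release page.

```python
# Minimal sketch: loading the released OLMo 2 32B weights with Hugging Face Transformers.
# The repository ID below is an assumption; verify the exact name on Ai2's model listing.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0325-32B"  # assumed repo ID for the base (pre-trained) model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B parameters: bf16 plus a multi-GPU setup is assumed
    device_map="auto",
)

prompt = "The Allen Institute for AI released OLMo 2 32B to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```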

Development followed a phased training approach comprising pre-training, mid-training, and post-training stages. The post-training phase applied the Tülu 3.1 recipe, which culminates in Reinforcement Learning with Verifiable Rewards (RLVR) implemented with Group Relative Policy Optimization (GRPO). This stage is intended to strengthen instruction following and performance in domains such as mathematics and reasoning.
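The group-relative step that gives GRPO its name can be summarized in a few lines. The sketch below is a simplified illustration rather than Ai2's training code: for each prompt, several responses are sampled, scored with a verifiable reward (for example, exact match on a math answer), and each response's advantage is its reward standardized against the group's mean and standard deviation before it weights the policy update.

```python
# Simplified illustration of GRPO's group-relative advantage (not Ai2's training code).
# For one prompt, G sampled responses are scored with a verifiable reward
# (e.g., 1.0 if the final math answer is exactly correct, else 0.0); each response's
# advantage is its reward standardized against the group's statistics.
from statistics import mean, pstdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: 4 sampled responses to the same problem, 2 verified correct.
rewards = [1.0, 0.0, 1.0, 0.0]
print(group_relative_advantages(rewards))
# Correct responses receive positive advantages, incorrect ones negative;
# these advantages then weight a clipped policy-gradient update.
```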

In academic evaluations, OLMo 2 32B demonstrates competitive performance against both proprietary and open-weight models. Ai2 reports that it matches or surpasses GPT-3.5 Turbo and GPT-4o mini on multi-skill benchmark suites while using substantially less training compute than comparably sized open-weight models. Training was carried out with the OLMo-core framework on Google's Augusta AI Hypercomputer, with an emphasis on computational efficiency and research reproducibility.

Rankings & Comparison