Allen AI
Open Weights

Llama 3.1 Tulu 3 405B

Released Jan 2025

Intelligence: #311
Arena AI: #175
Context: 128K
Parameters: 405B

Llama 3.1 Tulu 3 405B is a large-scale language model developed by the Allen Institute for AI (AI2), representing the application of the Tulu 3 post-training recipe to the 405-billion parameter Llama 3.1 base model. It is part of an initiative to provide a fully open-source pipeline for high-performance instruction-following models, including transparent training data, code, and methodologies.

The model utilizes a multi-stage post-training process that includes Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and a specialized technique called Reinforcement Learning from Verifiable Rewards (RLVR). RLVR is specifically designed to improve performance in domains with objective ground truths, such as mathematics and programming, by providing feedback based on verifiable outcomes rather than human preference alone.
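The idea behind RLVR can be illustrated with a minimal sketch: instead of scoring a completion with a learned preference model, the reward is computed by checking the output against a known ground truth. The answer format and helper below are hypothetical illustrations, not Tulu 3's actual implementation.

```python
import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Sketch of an RLVR-style reward: extract the model's final answer
    and compare it to a known ground truth, yielding a binary reward
    rather than a graded human-preference score."""
    # Assumes answers are stated as "The answer is <value>." -- an
    # illustrative convention, not the model's real output format.
    match = re.search(r"answer is\s*(-?\d+(?:\.\d+)?)", completion.lower())
    if match is None:
        return 0.0  # unparseable output earns no reward
    return 1.0 if match.group(1) == ground_truth else 0.0

# A verifiably correct completion is rewarded; a wrong one is not.
print(verifiable_reward("The answer is 42.", "42"))  # 1.0
print(verifiable_reward("The answer is 41.", "42"))  # 0.0
```

Because the reward is computed mechanically, it is well suited to domains like math and code where correctness can be checked, which is exactly where AI2 applies RLVR.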

In terms of capabilities, Tulu 3 405B is optimized for complex reasoning, mathematical problem solving (measured by benchmarks such as MATH and GSM8K), and precise instruction following (measured by IFEval). It is released under the Llama 3.1 Community License, and AI2 provides the Tulu 3 Mix dataset to the research community to encourage further study of large-scale post-training techniques.
