MBZUAI Institute of Foundation Models
Open Weights

K2 Think V2

Released Feb 2025

Intelligence: #170
Coding: #207
Context: 262K
Parameters: 7B, 32B

K2-Think is a series of reasoning-focused large language models developed by the MBZUAI Institute of Foundation Models. Released as part of the institution's second-generation model ecosystem, K2-Think is specifically optimized for complex cognitive tasks such as mathematics, coding, and logical deduction. The series includes variants with 7 billion and 32 billion parameters.

The models are distinguished by their use of Reinforcement Learning with Verifiable Rewards (RLVR). This training method uses deterministic feedback, such as mathematical answer verifiers and code compilers, to guide the model's reasoning process. Because the rewards are computed automatically rather than drawn solely from human-annotated chain-of-thought data, training can directly reinforce multi-step solutions whose final results check out.
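
To make the idea of a verifiable reward concrete, here is a minimal Python sketch of a math-answer verifier. It is an illustration only, not the K2-Think training code: the function names, the convention that the final answer appears in \boxed{...}, and the binary 0/1 reward are all assumptions made for this example.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the text, or None.

    Assumption for this sketch: the model marks its final answer with \\boxed{}.
    Nested braces inside the box are not handled.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def math_reward(completion: str, reference: str) -> float:
    """Hypothetical binary reward: 1.0 if the final answer matches, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0  # no checkable answer, so no reward
    try:
        # Compare numerically so that "0.5" and "1/2" count as the same answer.
        return 1.0 if Fraction(answer) == Fraction(reference) else 0.0
    except (ValueError, ZeroDivisionError):
        # Non-numeric answers fall back to an exact string comparison.
        return 1.0 if answer == reference.strip() else 0.0


if __name__ == "__main__":
    sample = "Half of 1 is one half, so the answer is \\boxed{0.5}."
    print(math_reward(sample, "1/2"))
```

The reward depends only on whether the final answer can be checked, not on how the reasoning is worded, which is what allows it to be computed without human annotation. A code-oriented verifier would follow the same pattern, with compilation or unit-test results taking the place of the answer comparison.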

K2-Think is built on the Qwen 2.5 architecture and has shown competitive performance on technical benchmarks including MATH, GSM8K, and HumanEval. It is designed to reduce hallucinations in verifiable domains by prioritizing the correctness of intermediate reasoning steps during inference.

Rankings & Comparison