MiniMax
Open Weights

MiniMax M1 80k

Released Jun 2025

Intelligence: #166
Coding: #214
Math: #116
Context: 1M tokens
Parameters: 456B

MiniMax M1 80k is an open-weight, large-scale reasoning model developed by MiniMax. It uses a hybrid Mixture-of-Experts (MoE) architecture combined with Lightning Attention, an attention mechanism with linear computational complexity that enables efficient processing of extremely long sequences. The model contains 456 billion parameters in total, with approximately 45.9 billion parameters activated per token during inference.
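To illustrate why linear attention scales to long sequences, here is a minimal causal linear-attention sketch (not MiniMax's actual Lightning Attention kernel; the ReLU feature map `phi` is an illustrative assumption). Instead of materializing the n×n attention matrix, it carries a running d×d key-value summary, so cost grows linearly with sequence length:

```python
import numpy as np

def linear_attention(Q, K, V):
    """Causal linear attention: O(n * d^2) instead of O(n^2 * d)."""
    n, d = Q.shape
    phi = lambda x: np.maximum(x, 0.0) + 1e-6  # positive feature map (assumption)
    S = np.zeros((d, d))      # running sum of outer(phi(k), v)
    z = np.zeros(d)           # running sum of phi(k), for normalization
    out = np.zeros_like(V)
    for t in range(n):
        q, k = phi(Q[t]), phi(K[t])
        S += np.outer(k, V[t])          # update key-value summary
        z += k
        out[t] = (q @ S) / (q @ z + 1e-6)
    return out
```

Because the state `(S, z)` is fixed-size, memory stays constant as the context grows, which is the property that makes a 1M-token window practical.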

A defining feature of the M1 series is its support for a 1 million token input context window. The "80k" designation refers to the model's reasoning output budget: the model can generate up to 80,000 tokens of internal chain-of-thought ("thinking") output in a single response. This capacity is intended to facilitate complex problem decomposition and long-form agentic workflows in tasks such as software engineering and large-scale document analysis.
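A reasoning budget of this kind is typically enforced as a cap on the decoding loop. The sketch below is a generic illustration of that idea, not MiniMax's implementation; `step_fn` and the `</think>` stop token are hypothetical placeholders:

```python
def generate_with_budget(step_fn, max_thinking_tokens=80_000, stop_token="</think>"):
    """Decode 'thinking' tokens until the model closes its reasoning span
    or the budget is exhausted, whichever comes first.

    step_fn(history) -> next token string (hypothetical model interface).
    """
    thinking = []
    for _ in range(max_thinking_tokens):
        tok = step_fn(thinking)
        if tok == stop_token:
            break  # model finished reasoning on its own
        thinking.append(tok)
    return thinking
```

With an 80,000-token cap, the model can exhaust the budget mid-reasoning, which is why larger budgets (such as the 80k variant over a smaller one) tend to help on tasks needing deep problem decomposition.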

MiniMax M1 was trained using a reinforcement learning algorithm known as CISPO (Clipped IS-weight Policy Optimization), which stabilizes training and accelerates convergence in hybrid architectures. Released under the Apache 2.0 license, the model is designed to achieve high-efficiency performance on reasoning-heavy benchmarks while using significantly fewer computational resources than standard softmax-attention Transformers at long sequence lengths.
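The core idea of CISPO is to clip the importance-sampling weight itself (treating it as a constant with respect to the gradient), rather than clipping the policy update as PPO does, so every token retains a log-probability gradient. A minimal sketch of the per-token objective weight, with symmetric clipping bounds chosen here for illustration:

```python
import numpy as np

def cispo_token_weights(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """Per-token weights multiplying grad log pi in a CISPO-style objective.

    eps_low / eps_high values are illustrative assumptions, not the
    published hyperparameters.
    """
    # Importance ratio between current and behavior policies.
    r = np.exp(logp_new - logp_old)
    # Clip the IS weight itself; in training this factor would be held
    # constant (stop-gradient), so no token's gradient is zeroed out.
    r_clipped = np.clip(r, 1.0 - eps_low, 1.0 + eps_high)
    return r_clipped * advantages
```

Compared with PPO-style clipping, which silently drops gradients for tokens whose update is clipped, keeping all tokens in the gradient is credited with the more stable, faster-converging RL training described above.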

Rankings & Comparison