MiniMax-M2 is an open-weight large language model developed by the Shanghai-based AI startup MiniMax. Released in October 2025, the model is specifically optimized for coding and agentic workflows, aiming to provide frontier-level performance while maintaining high inference speed and cost efficiency. It uses a sparse Mixture of Experts (MoE) architecture with 230 billion total parameters, of which only 10 billion are activated for any single token during inference.
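The efficiency of this design comes from a learned router that selects only a few experts per token. The sketch below illustrates the general top-k routing pattern behind sparse MoE layers; the expert count, dimensions, and top-k value are illustrative placeholders, not MiniMax's published configuration.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k experts of a sparse MoE layer."""
    logits = x @ gate_w                      # router score per expert
    top = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                 # softmax restricted to the chosen experts
    # Only the selected experts execute; per-token compute scales with k,
    # not with the total number of experts in the layer.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy configuration: 8 experts, 2 active per token. (MiniMax-M2 activates
# ~10B of its 230B parameters per token; its exact router settings are not
# restated here.)
rng = np.random.default_rng(0)
dim, n_experts = 16, 8
experts = [(lambda W: (lambda h: np.tanh(h @ W)))(rng.normal(size=(dim, dim)))
           for _ in range(n_experts)]
y = moe_forward(rng.normal(size=dim), rng.normal(size=(dim, n_experts)), experts)
```

The practical consequence is that memory must hold all 230 billion parameters while per-token compute corresponds only to the roughly 10 billion active ones, which is what allows frontier-scale capacity at a lower serving cost.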
A defining feature of MiniMax-M2 is its "interleaved thinking" capability: the model interleaves explicit reasoning traces with tool calls and their results, and these traces persist across turns so it can plan and revise its approach to complex, multi-step problems. This design is particularly effective for autonomous agents, enabling the model to manage long-horizon toolchains across environments such as shells, web browsers, and code interpreters. It demonstrates strong performance on software engineering benchmarks, including multi-file repository editing and automated bug fixing.
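In practice, the persistence requirement falls on the client: the assistant's reasoning segments must be sent back unmodified with each subsequent request. The following is a minimal sketch assuming an OpenAI-compatible chat endpoint; the URL, key, and the `<think>` tag convention referenced in the comments are assumptions about the deployment rather than a documented API.

```python
import requests

# Hypothetical OpenAI-compatible endpoint; substitute a real deployment URL and key.
API_URL = "https://api.example.invalid/v1/chat/completions"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def chat(messages):
    resp = requests.post(API_URL, headers=HEADERS,
                         json={"model": "MiniMax-M2", "messages": messages})
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]

history = [{"role": "user", "content": "Run the test suite and fix any failure."}]

reply = chat(history)
# Append the assistant turn verbatim. Stripping its <think>...</think>
# reasoning segment would break the interleaved-thinking chain, because the
# model conditions on its own earlier traces in later turns.
history.append(reply)

# Feed the tool result back (simplified here to a plain user message) and let
# the model continue the same reasoning thread.
history.append({"role": "user", "content": "pytest output: 1 failed, 41 passed"})
reply = chat(history)
```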
The model supports a context window of 196,608 tokens, large enough to process sizable codebases and extensive documentation in a single pass. MiniMax-M2 was released under a modified MIT license, permitting broad usage and local deployment of its weights.
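For local deployment, the full context length can be requested explicitly at load time. Below is a minimal sketch using vLLM; the Hugging Face repository id and the assumption that vLLM supports the architecture out of the box are not verified here.

```python
from vllm import LLM, SamplingParams

# Minimal local-inference sketch. Assumes vLLM support for the architecture
# and sufficient GPU memory; the repo id below is assumed, not verified here.
llm = LLM(model="MiniMaxAI/MiniMax-M2",
          max_model_len=196608,        # request the full advertised context window
          trust_remote_code=True)

prompts = ["Summarize the build system of the repository pasted below:\n..."]
outputs = llm.generate(prompts, SamplingParams(max_tokens=512, temperature=0.7))
print(outputs[0].outputs[0].text)
```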