GLM-5 is a flagship large language model developed by Zhipu AI (Z.ai), released in February 2026. Built on a Mixture-of-Experts (MoE) architecture, the model contains approximately 744 billion total parameters, of which roughly 40 billion are active per token during inference. It is specifically engineered for "Agentic Engineering," a paradigm focused on autonomous planning, multi-step reasoning, and reliable execution in complex software and systems engineering tasks.
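The total-versus-active parameter split comes from top-k expert routing: each token passes through only a few of the layer's experts, so most weights sit idle for any given token. The sketch below is a minimal, generic top-k MoE layer in NumPy for illustration only; the class name, dimensions, and gating details are assumptions and do not reflect GLM-5's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyMoELayer:
    """Generic top-k Mixture-of-Experts feed-forward layer (illustrative
    sketch only, not GLM-5's real design). Each token is routed to k of
    n experts, so only a fraction of the layer's parameters are used per
    token -- the mechanism behind "~744B total / ~40B active" counts."""

    def __init__(self, d_model, d_hidden, n_experts, k):
        self.k = k
        self.router = rng.standard_normal((d_model, n_experts)) * 0.02
        self.w_in = rng.standard_normal((n_experts, d_model, d_hidden)) * 0.02
        self.w_out = rng.standard_normal((n_experts, d_hidden, d_model)) * 0.02

    def __call__(self, x):
        logits = x @ self.router                          # (tokens, n_experts)
        topk = np.argsort(logits, axis=-1)[:, -self.k:]   # chosen experts per token
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            gates = logits[t, topk[t]]
            gates = np.exp(gates - gates.max())
            gates /= gates.sum()                          # softmax over selected experts
            for g, e in zip(gates, topk[t]):
                h = np.maximum(x[t] @ self.w_in[e], 0.0)  # expert FFN with ReLU
                out[t] += g * (h @ self.w_out[e])
        return out, topk

layer = ToyMoELayer(d_model=16, d_hidden=32, n_experts=8, k=2)
x = rng.standard_normal((4, 16))
y, chosen = layer(x)
print(y.shape, chosen.shape)  # (4, 16) (4, 2)

# Only k/n_experts of the expert weights are touched per token.
total_expert_params = layer.w_in.size + layer.w_out.size
active_expert_params = total_expert_params * layer.k // 8
```

With `k=2` of 8 experts, the active expert-parameter count per token is a quarter of the total, mirroring (at toy scale) how a ~744B-parameter MoE can run with only ~40B parameters active.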
The model introduces a specialized Thinking Mode that uses interleaved thinking tokens to improve performance on high-density logic and mathematical problems. It incorporates DeepSeek Sparse Attention (DSA) to manage its 200,000-token context window efficiently, significantly reducing memory overhead compared to previous generations. On industry benchmarks, GLM-5 demonstrated competitive results in coding and logical reasoning, including a score of 77.8% on SWE-bench Verified.
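The memory saving from sparse attention comes from each query reading only a small, high-scoring subset of the key-value cache rather than all 200,000 positions. The following is a generic top-k sparse attention sketch, not the actual DSA algorithm (whose selection mechanism is not described here); for simplicity this toy still scores every key, whereas practical implementations avoid that with a cheap indexing pass.

```python
import numpy as np

def topk_sparse_attention(q, k_mat, v, k_keep):
    """Generic top-k sparse attention (illustrative only; not the real
    DeepSeek Sparse Attention algorithm). Each query keeps just its
    k_keep highest-scoring keys, so the softmax and value-weighted sum
    touch a small subset of the KV cache instead of every position."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    scores = (q @ k_mat.T) * scale                      # (n_q, n_kv) raw scores
    keep = np.argsort(scores, axis=-1)[:, -k_keep:]     # selected key indices
    out = np.zeros((q.shape[0], v.shape[1]))
    for i in range(q.shape[0]):
        s = scores[i, keep[i]]
        w = np.exp(s - s.max())
        w /= w.sum()                                    # softmax over selected keys only
        out[i] = w @ v[keep[i]]
    return out, keep

rng = np.random.default_rng(1)
q = rng.standard_normal((2, 8))     # 2 queries, head dim 8
k_mat = rng.standard_normal((64, 8))  # 64 cached keys
v = rng.standard_normal((64, 8))
out, keep = topk_sparse_attention(q, k_mat, v, k_keep=8)
print(out.shape, keep.shape)  # (2, 8) (2, 8)
```

Here each query attends to 8 of 64 cached positions; at a 200,000-token context the same idea cuts per-query KV reads by orders of magnitude.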
A significant technical milestone is the model's training infrastructure: GLM-5 was trained entirely on Huawei Ascend chips using the MindSpore framework, achieving frontier-level performance independently of NVIDIA hardware. Post-training used Slime, Zhipu's asynchronous reinforcement learning engine, to increase training throughput, with the post-training recipe also targeting hallucination control. The model is released under the MIT license, permitting both commercial and research use.
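The core idea of asynchronous RL training is decoupling rollout generation from parameter updates, so the learner never idles waiting for slow generations. The sketch below illustrates that producer-consumer pattern with threads and a queue; it is a hypothetical minimal example, and none of the names or mechanics here describe Slime's actual implementation.

```python
import queue
import random
import threading

def rollout_worker(policy_version, out_q, n_rollouts):
    """Generate rollouts continuously, tagging each with the policy
    version it was sampled under; it never blocks on the learner."""
    for _ in range(n_rollouts):
        out_q.put({"version": policy_version[0], "reward": random.random()})

def learner(in_q, policy_version, n_updates):
    """Consume whichever rollouts are ready and 'update' the policy.
    Workers may lag behind, producing off-policy (stale) rollouts."""
    consumed = []
    for _ in range(n_updates):
        consumed.append(in_q.get())
        policy_version[0] += 1  # simulate a parameter update
    return consumed

rollout_q = queue.Queue()
version = [0]  # shared mutable policy version
worker = threading.Thread(target=rollout_worker, args=(version, rollout_q, 16))
worker.start()
updates = learner(rollout_q, version, n_updates=16)
worker.join()

# Rollouts sampled under an older policy version are "stale" -- the
# price asynchronous RL pays for keeping both sides busy.
stale = sum(u["version"] < version[0] for u in updates)
print(len(updates))  # 16
```

Real systems add importance-weighting or staleness limits to correct for off-policy rollouts; the sketch only shows why such corrections become necessary once generation and learning run concurrently.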