Z AI
Open Weights

GLM-4.6 (Non-reasoning)

Released Sep 2025

GLM-4.6 is a flagship large language model developed by Z.ai (formerly Zhipu AI), released in late 2025. It uses a Mixture-of-Experts (MoE) architecture with roughly 355 billion total parameters, of which about 32 billion are active for any given token. The model is positioned as a high-performance open-weight alternative to proprietary frontier models, optimized for complex coding, multi-step agentic workflows, and long-context comprehension.

A central feature of GLM-4.6 is its expanded 200,000-token context window, which allows the model to process extensive inputs such as entire codebases and long-form technical documentation. This is a significant increase over its predecessor, GLM-4.5, and is paired with improved token efficiency: Z.ai reports that GLM-4.6 completes comparable tasks using roughly 15% fewer tokens. The model is trained to maintain high retrieval accuracy and reasoning coherence across its full context range.
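In practice, callers working near a 200,000-token window usually budget tokens before submitting a long prompt. The sketch below illustrates one simple approach; the characters-divided-by-four heuristic and the reserved-output figure are illustrative assumptions, not properties of GLM-4.6's actual tokenizer.

```python
# Rough token-budget check before sending long inputs to a
# 200K-context model such as GLM-4.6. The chars/4 rule is a
# common rough estimate for English text and code, NOT the
# model's real tokenizer.
CONTEXT_WINDOW = 200_000
RESERVED_FOR_OUTPUT = 8_000  # leave headroom for the completion


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)


def fits_in_context(documents: list[str], prompt: str) -> bool:
    """True if prompt plus documents fit under the input budget."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    total = estimate_tokens(prompt) + sum(
        estimate_tokens(d) for d in documents
    )
    return total <= budget
```

A real integration would replace `estimate_tokens` with the model's own tokenizer, but the budgeting logic stays the same.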

GLM-4.6 is tuned for agentic capabilities, supporting native tool use, function calling, and autonomous search-based reasoning. It is notably strong at generating polished front-end code and solving intricate programming challenges. While the model includes a "thinking mode" for tool-assisted reasoning, it is primarily categorized as a general-purpose autoregressive model rather than a specialized reinforcement-learning-driven reasoning model (such as the o1 or R1 series).

GLM-4.6 was released under the permissive MIT license, which permits local deployment and commercial customization. This open-weight approach gives enterprises a frontier-scale model they can self-host to meet stringent data privacy and security requirements.

Rankings & Comparison