glm-4-0520 is a proprietary large language model developed by Zhipu AI and released in May 2024. It is an updated iteration of the flagship GLM-4 model, succeeding the earlier 0116 snapshot, and is designed for high-performance natural language understanding, complex reasoning, and tool-augmented agentic tasks.
Technically, the model uses Grouped Query Attention (GQA) to improve inference efficiency and reduce the memory footprint of the KV cache. It was pre-trained on a multilingual corpus exceeding 10 trillion tokens, primarily Chinese and English. According to the GLM-4 technical report, glm-4-0520 performs comparably to frontier models such as GPT-4 and Gemini 1.5 Pro on benchmarks including MMLU and GSM8K.
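The KV-cache saving from GQA comes from letting several query heads share one key/value head, so only the smaller set of KV heads needs to be cached. The toy sketch below (not GLM-4's actual implementation; head counts and dimensions are illustrative) shows 8 query heads attending through just 2 shared KV heads, a 4x reduction in cached tensors:

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy grouped-query attention.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads shares one KV head,
    shrinking the KV cache by that same factor versus multi-head attention.
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                        # shared KV head for this query head
        scores = q[h] @ k[kv].T / np.sqrt(d)   # (seq, seq) attention logits
        scores = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights = scores / scores.sum(axis=-1, keepdims=True)  # softmax rows
        out[h] = weights @ v[kv]
    return out

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))  # 8 query heads
k = rng.standard_normal((2, 4, 16))  # only 2 KV heads are cached
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (8, 4, 16)
```

Setting the number of KV heads to 1 recovers multi-query attention, and setting it equal to the number of query heads recovers standard multi-head attention.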
The model is highly optimized for agentic workflows and "All Tools" capabilities, including native support for web search, code execution through a Python interpreter, and robust function calling. It supports a context window of 128,000 tokens, maintaining high accuracy in long-context retrieval tasks such as "needle in a haystack" evaluations.