Tsinghua
Open Weights

chatglm-6b

Released Mar 2023

Arena AI: #273
Context: 2K
Parameters: 6.2B

ChatGLM-6B is an open-source, bilingual language model designed for dialogue and question-answering in both Chinese and English. Developed by the Knowledge Engineering Group (KEG) and the Data Mining Group (THUDM) at Tsinghua University, it is based on the General Language Model (GLM) architecture. The model contains 6.2 billion parameters and was trained on approximately 1 trillion tokens of high-quality bilingual corpora.

Architecture and Training

The model utilizes an autoregressive blank-infilling objective, which allows it to handle both natural language understanding and generation tasks effectively. It was further refined through supervised fine-tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to better align with human conversational preferences. To improve efficiency for local deployment, ChatGLM-6B supports model quantization, allowing it to run on consumer-grade hardware with as little as 6GB of VRAM when using INT4 quantization.
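The VRAM figures above follow from simple arithmetic on the parameter count. A minimal sketch of that estimate, assuming weight storage only (real usage adds activation and KV-cache overhead, which is why the card quotes ~6GB rather than ~3GB for INT4):

```python
# Back-of-the-envelope VRAM estimate for ChatGLM-6B's weights at
# different precisions. Illustrative arithmetic only; activations
# and the KV cache add further overhead at inference time.

PARAMS = 6.2e9  # parameter count from the model card


def weight_memory_gib(bits_per_param: float) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return PARAMS * bits_per_param / 8 / 2**30


for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gib(bits):.1f} GiB")
```

At INT4 the weights alone come to roughly 2.9 GiB, leaving room inside a 6GB budget for runtime overhead.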

Performance and Capabilities

ChatGLM-6B is specifically optimized for Chinese-language interactions while maintaining strong performance in English. It supports a context length of 2,048 tokens and utilizes relative position encoding. As the foundational model in the ChatGLM series, its open-weights release significantly contributed to the development of efficient, locally deployable bilingual AI assistants.
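The 2,048-token context limit means a long chat history must be truncated before each turn. A minimal sketch of one common strategy, dropping the oldest turns first; the whitespace "tokenizer" here is a stand-in assumption, not the model's actual SentencePiece tokenizer:

```python
# Sketch: keeping a dialogue history within a fixed context window
# (2,048 tokens for ChatGLM-6B) by discarding the oldest turns.

CONTEXT_LIMIT = 2048


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting as a rough token-count proxy;
    # the real tokenizer would produce different counts.
    return len(text.split())


def trim_history(turns: list[str], limit: int = CONTEXT_LIMIT) -> list[str]:
    """Drop oldest turns until the estimated token total fits the limit."""
    kept: list[str] = []
    total = 0
    for turn in reversed(turns):  # walk from newest to oldest
        n = count_tokens(turn)
        if total + n > limit:
            break
        kept.append(turn)
        total += n
    return list(reversed(kept))  # restore chronological order
```

Newest turns are kept preferentially, so the model always sees the most recent exchange even when early history is discarded.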
