Alibaba
Open Weights

Qwen2 Instruct 72B

Released Jun 2024

Intelligence: #363
Arena AI: #189
Context: 131K
Parameters: 72B

Qwen2-72B-Instruct is a dense large language model developed by the Qwen team at Alibaba Cloud. The flagship of the Qwen2 series, it has 72.7 billion parameters and uses a decoder-only Transformer architecture. It incorporates Grouped Query Attention (GQA), in which groups of query heads share a smaller set of key/value heads, reducing the key/value cache and improving inference efficiency. The model supports a context window of up to 128K (131,072) tokens, enough to process long documents in a single pass.
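To make the GQA efficiency claim concrete, the sketch below estimates the key/value cache size for one full-length sequence with and without shared key/value heads. The layer count, head counts, head dimension, and 16-bit cache precision are illustrative assumptions, not figures stated on this page, and `kv_cache_bytes` is a hypothetical helper.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative configuration (assumed for this sketch, not taken from the page).
NUM_LAYERS = 80
NUM_QUERY_HEADS = 64   # standard multi-head attention caches one K/V head per query head
NUM_KV_HEADS = 8       # GQA: each K/V head is shared by a group of query heads
HEAD_DIM = 128
SEQ_LEN = 131_072      # full context window

mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"Multi-head cache:    {mha_bytes / 2**30:.0f} GiB")
print(f"Grouped-query cache: {gqa_bytes / 2**30:.0f} GiB")
print(f"Reduction:           {mha_bytes // gqa_bytes}x")
```

Under these assumptions the cache shrinks in proportion to the ratio of query heads to key/value heads, which is why GQA matters most at long context lengths.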

The model is fine-tuned for instruction following, with particular strength in complex reasoning, mathematics, and programming. Post-training consisted of supervised fine-tuning (SFT) followed by Direct Preference Optimization (DPO), which trains the model to prefer the responses that human raters favor and so improves alignment and overall response quality.
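For readers unfamiliar with DPO, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not the Qwen team's training code: the function name `dpo_loss`, the argument names, and the default `beta=0.1` are assumptions made for illustration. Each argument is the summed log-probability that a model assigns to a whole response.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair."""
    # Implicit rewards: how much more likely the policy makes each response
    # compared with the frozen reference (SFT) model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)
# Policy raises the chosen response and lowers the rejected one: the loss drops.
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)
print(baseline, improved)
```

The loss falls as the policy widens the gap between chosen and rejected responses relative to the reference model, and `beta` controls how strongly the policy is kept close to that reference.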

Multilingual proficiency is a core design goal: in addition to English and Chinese, the training data covers 27 other languages, including French, Spanish, German, Russian, Arabic, Japanese, and Korean. The tokenizer is designed to encode both natural languages and programming languages efficiently.

Rankings & Comparison