Qwen1.5-72B-Chat is a 72-billion parameter large language model developed by Alibaba Cloud's Qwen team. Released as the beta version of the Qwen2 series, it is a transformer-based decoder-only model specifically fine-tuned for conversational and instruction-following tasks. It represents a significant iteration over the original Qwen series, focusing on improved human preference alignment and multilingual capabilities.
The model's architecture incorporates the SwiGLU activation function and Rotary Positional Embeddings (RoPE). While the Qwen1.5 series introduced refinements such as Grouped Query Attention (GQA) for some model sizes, the 72B variant in this beta release retains standard multi-head attention. The entire Qwen1.5 family supports a uniform context length of 32,768 tokens, facilitating the processing of long-form documents and extended dialogues.
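RoPE encodes position by rotating pairs of query/key dimensions by an angle proportional to the token's index, which makes attention scores depend only on relative offsets between positions. A minimal sketch of the rotation for a single head vector (function name, dimensions, and the base of 10000 are illustrative conventions, not taken from Qwen's codebase):

```python
import math

def rope_rotate(vec, position, base=10000.0):
    """Apply Rotary Positional Embeddings to one head vector (sketch).

    Each dimension pair (2i, 2i+1) is rotated by position * theta_i,
    where theta_i = base^(-2i/d). Absolute position is encoded in the
    rotation angle, while dot products between rotated queries and keys
    depend only on the relative offset between their positions.
    """
    d = len(vec)
    out = [0.0] * d
    for i in range(0, d, 2):
        angle = position * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out[i] = x * c - y * s
        out[i + 1] = x * s + y * c
    return out
```

Because each pair is rotated rigidly, the transform preserves vector norms, and shifting both query and key positions by the same amount leaves their dot product unchanged.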
Qwen1.5-72B-Chat was pretrained on a large-scale corpus and then post-trained with Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to align its outputs with human preferences. Evaluation benchmarks indicate strong performance in language understanding, reasoning, and mathematics, often exceeding comparable models such as Llama2-70B. It demonstrates robust proficiency across multiple languages, including English, Chinese, Spanish, French, Japanese, and Korean.
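DPO trains directly on preference pairs, with no separate reward model: it raises the policy's log-probability of the preferred response relative to the rejected one, measured against a frozen reference model. A toy sketch of the per-pair loss (the beta value and log-probability inputs are illustrative; the source does not describe Qwen's actual training configuration):

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair Direct Preference Optimization loss (sketch).

    The policy is rewarded for increasing the chosen response's
    log-probability relative to the rejected one, each measured as a
    deviation from a frozen reference model; beta controls how far the
    policy may drift from that reference.
    """
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)), computed as the numerically stable softplus(-z)
    return math.log1p(math.exp(-z)) if z > -30 else -z
```

When the policy still matches the reference, both margins are zero and the loss sits at log 2; it falls as the policy shifts probability mass toward preferred responses.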
The model weights are open-sourced under the Tongyi Qianwen License Agreement, which permits commercial use subject to specific user-threshold conditions. The release is compatible with the Hugging Face transformers library, ensuring streamlined integration with existing AI development ecosystems, and the weights are also distributed in several quantized formats, including GPTQ, AWQ, and GGUF.
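Loading the chat model through transformers follows the library's standard causal-LM workflow: build a message list, render it with the tokenizer's chat template, and generate. A hedged sketch is below; the model ID matches the Hugging Face Hub naming, but the prompts and generation parameters are illustrative, and actually calling `main()` requires downloading roughly 145 GB of weights:

```python
def build_messages(user_prompt, system_prompt="You are a helpful assistant."):
    """Assemble the messages list consumed by tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def main():
    # Requires transformers with Qwen2 support and enough GPU memory
    # (or CPU RAM) to hold the 72B weights; not run here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen1.5-72B-Chat"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    messages = build_messages("Give me a short introduction to large language models.")
    # Render the ChatML-style prompt and append the assistant turn marker.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    print(tokenizer.decode(
        output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
    ))
```

The same `build_messages` structure works unchanged with the GPTQ and AWQ checkpoints, since those load through the identical `AutoModelForCausalLM` entry point.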