Alibaba
Open Weights

qwen1.5-7b-chat

Released Feb 2024

Arena AI #245
Parameters 7B

Qwen1.5-7B-Chat is a 7-billion-parameter language model developed by the Qwen team at Alibaba Cloud. An iterative update to the original Qwen series, it serves as a beta of the subsequent Qwen2 release. The model is a causal decoder-only transformer fine-tuned to improve human-preference alignment, multilingual proficiency, and reasoning.
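
As context for how the chat model is typically run, below is a minimal inference sketch using the Hugging Face transformers library; it assumes the weights are published under the model ID Qwen/Qwen1.5-7B-Chat and that a transformers version with chat-template support (roughly 4.37 or newer) plus accelerate are installed.

```python
# Minimal chat-inference sketch with Hugging Face transformers.
# Assumes the weights are available as "Qwen/Qwen1.5-7B-Chat" and that
# transformers >= 4.37 (with chat-template support) and accelerate are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what a decoder-only transformer is."},
]
# Build the prompt with the tokenizer's built-in chat template.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Drop the prompt tokens before decoding the assistant's reply.
reply = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(reply)
```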

The architecture incorporates SwiGLU activation and Rotary Positional Embeddings (RoPE) to enhance stability and performance. It supports a context window of up to 32,768 tokens, allowing it to process and generate significantly longer documents than its predecessor. While larger models in the series use Grouped Query Attention (GQA), the 7B version relies on standard multi-head attention and remains small enough for efficient inference on consumer-grade hardware.
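
The two components named above can be illustrated with a short, self-contained PyTorch sketch; the module and function names, shapes, and the channel-pairing convention in the rotary embedding are illustrative assumptions and do not reproduce the actual Qwen1.5 code.

```python
# Toy sketches of SwiGLU and rotary position embeddings (RoPE).
# Shapes, names, and the channel-pairing convention are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Feed-forward block: down( silu(x @ W_gate) * (x @ W_up) )."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x (seq_len, dim) by position-dependent angles."""
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = 1.0 / (base ** (torch.arange(half, dtype=torch.float32) / half))
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    x1, x2 = x[:, :half], x[:, half:]
    return torch.cat(
        [x1 * angles.cos() - x2 * angles.sin(),
         x1 * angles.sin() + x2 * angles.cos()],
        dim=-1,
    )
```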

Training for Qwen1.5-7B-Chat followed a multi-stage pipeline: pretraining on a large, diverse corpus, then Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). These stages improve performance on coding, mathematics, and complex instruction following. The model can understand and respond in dozens of languages, including English, Chinese, French, Spanish, and Arabic.
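
As a rough illustration of the preference-optimization stage, the sketch below implements the standard DPO objective over per-response log-probabilities; the function name, inputs, and the beta value are schematic assumptions rather than details of the Qwen team's training setup.

```python
# Schematic Direct Preference Optimization (DPO) loss.
# Inputs are summed log-probabilities of each chosen/rejected response under
# the trained policy and a frozen reference model; beta is illustrative.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards: log-ratio of policy vs. reference for each response.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and dispreferred responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```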

The model is released under the Tongyi Qianwen License Agreement, which permits both research and commercial use. Commercial applications serving more than 100 million monthly active users require a separate license from Alibaba Cloud.

Rankings & Comparison