Qwen1.5-110B-Chat is a large-scale language model developed by Alibaba's Qwen team and the first model in the Qwen1.5 series to exceed 100 billion parameters. Built on a decoder-only Transformer architecture, it incorporates Grouped Query Attention (GQA), in which groups of query heads share a smaller set of key/value heads, shrinking the KV cache and improving inference efficiency without a large loss of model capacity. It serves as a bridge between the initial Qwen series and the subsequent Qwen2 generation.
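The KV-sharing idea behind GQA can be shown in a few lines of numpy. This is an illustrative sketch with toy head counts (8 query heads sharing 2 key/value heads), not the model's actual configuration:

```python
import numpy as np

def gqa(q, k, v):
    """Grouped Query Attention sketch: each KV head serves a group of
    query heads, so the KV cache holds fewer heads than the queries."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=0)                 # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # scaled dot-product
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))  # 8 query heads, toy sizes
k = rng.standard_normal((2, 4, 16))  # only 2 KV heads are cached
v = rng.standard_normal((2, 4, 16))
out = gqa(q, k, v)
print(out.shape)  # (8, 4, 16)
```

With 2 KV heads instead of 8, the cached keys and values occupy a quarter of the memory of standard multi-head attention while the query side keeps its full head count.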
The model is fine-tuned for conversational AI and complex instruction following, and supports a stable context window of 32,768 tokens. It is strongly multilingual, covering more than a dozen languages including Chinese, English, French, Spanish, German, and Japanese. At the time of its release, it was competitive with other leading open-weight models, such as Llama-3-70B, on benchmarks including MMLU, GSM8K, and HumanEval.
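Qwen chat models consume conversations in a ChatML-style prompt format. A minimal formatter sketch is below; the tokenizer's built-in chat template is authoritative, so treat this as an approximation of the wire format rather than a drop-in replacement:

```python
def to_chatml(messages):
    """Render a conversation in ChatML-style markup, approximately as
    used by Qwen chat models (the model's chat template is the source
    of truth for the exact layout)."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to reply
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give a short introduction to LLMs."},
])
print(prompt)
```

In practice one would call the tokenizer's `apply_chat_template` method from the `transformers` library rather than building the string by hand; the sketch only shows what that template produces in spirit.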
Architecture and Capabilities
Qwen1.5-110B-Chat uses architectural features shared across the Qwen1.5 family, including the SwiGLU activation and Rotary Positional Embedding (RoPE). The model was trained with large-scale pre-training on diverse datasets, followed by post-training via supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to align outputs with human preferences. Its scale yields improved reasoning and knowledge retrieval relative to smaller variants in the same family.
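RoPE encodes position by rotating pairs of channels through a position-dependent angle, so query-key dot products depend only on relative position. A compact numpy sketch follows; the split-halves channel pairing shown here is one common convention, and real implementations differ in how pairs are interleaved:

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotary positional embedding sketch: rotate channel pairs
    (x1_i, x2_i) by angle position * theta_i. Rotations preserve
    norms, and q.k after rotation depends only on relative position."""
    d = x.shape[-1]
    half = d // 2
    freqs = base ** (-np.arange(half) * 2.0 / d)   # per-pair frequencies
    angles = positions[:, None] * freqs[None, :]   # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 16))       # 6 positions, 16 channels
rot = rope(x, np.arange(6))
print(rot.shape)  # (6, 16)
```

Because each pair is only rotated, vector norms are unchanged, and shifting both a query's and a key's positions by the same offset leaves their dot product intact, which is the property that makes RoPE play well with long contexts.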