Alibaba
Open Weights

Qwen1.5 Chat 110B

Released Apr 2024

Intelligence: #399
Context: 32K
Parameters: 110B

Qwen1.5-110B-Chat is the largest dense model in the Qwen1.5 series and Alibaba's first open-weights model to exceed 100 billion parameters. Positioned as a beta for the Qwen2 generation, it is a decoder-only transformer trained on a massive multilingual dataset and fine-tuned specifically for conversational use, with significant improvements in human preference alignment.

## Architecture and Features

The model incorporates Grouped Query Attention (GQA), which improves inference efficiency and reduces memory overhead; this architectural feature distinguishes the 110B variant from many of the smaller models in the same series. It supports a context window of 32,768 tokens and uses components such as the SwiGLU activation and RoPE (Rotary Positional Embeddings).

## Capabilities and Performance

Qwen1.5-110B-Chat demonstrates competitive performance on global benchmarks, achieving results comparable to other state-of-the-art models such as Llama-3-70B. It provides robust multilingual support for dozens of languages, including English, Chinese, French, Spanish, German, and Japanese. The model was post-trained with both Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to refine its instruction-following and dialogue capabilities.
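To make the GQA memory claim concrete, here is a minimal NumPy sketch (not Qwen's actual implementation; head counts and dimensions are illustrative): each group of query heads attends using one shared key/value head, so the KV cache shrinks by the ratio of query heads to KV heads.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy GQA: each group of query heads shares one K/V head.

    q: (num_q_heads, seq, d)   k, v: (num_kv_heads, seq, d)
    Assumes num_q_heads is a multiple of num_kv_heads.
    """
    num_q_heads, num_kv_heads = q.shape[0], k.shape[0]
    group = num_q_heads // num_kv_heads
    d = q.shape[-1]
    outs = []
    for h in range(num_q_heads):
        kv = h // group  # query head h reads the shared KV head kv
        scores = q[h] @ k[kv].T / np.sqrt(d)
        # numerically stable softmax over the key axis
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        outs.append(w @ v[kv])
    return np.stack(outs)  # (num_q_heads, seq, d)

# 8 query heads sharing 2 KV heads: the KV cache is 4x smaller
# than in standard multi-head attention with 8 KV heads.
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 5, 16))
k = rng.normal(size=(2, 5, 16))
v = rng.normal(size=(2, 5, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (8, 5, 16)
```

With `num_kv_heads == num_q_heads` this reduces to ordinary multi-head attention, and with a single KV head it becomes multi-query attention; GQA sits between the two.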
