Alibaba
Open Weights

Qwen Chat 14B

Released Sep 2023

Intelligence: #445
Context: 8K
Parameters: 14B

Qwen-Chat-14B is a 14-billion parameter large language model developed by Alibaba Cloud as part of the Qwen (Tongyi Qianwen) series. It is an instruction-tuned version of the Qwen-14B base model, optimized for conversational tasks and multi-turn dialogue. The base model was pre-trained on over 3 trillion tokens of data spanning Chinese, English, code, and mathematics.
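Like other instruction-tuned Qwen chat models, Qwen-Chat-14B consumes multi-turn conversations serialized in a ChatML-style format, with turns delimited by `<|im_start|>` and `<|im_end|>` markers. A minimal sketch of assembling such a prompt; the helper name and default system message are illustrative assumptions, not taken from this page:

```python
def build_chatml_prompt(history, user_msg, system="You are a helpful assistant."):
    """Assemble a ChatML-style multi-turn prompt (hypothetical helper).

    history: list of (user_text, assistant_text) pairs from earlier turns.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user_text, assistant_text in history:
        parts.append(f"<|im_start|>user\n{user_text}<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant_text}<|im_end|>")
    parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>")
    # Leave the final assistant turn open for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([("Hi", "Hello! How can I help?")],
                             "What does RMSNorm do?")
```

The open trailing `<|im_start|>assistant` turn is what lets the model generate its reply; generation is stopped when it emits the matching `<|im_end|>` marker.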

The model architecture is a decoder-only Transformer with several technical refinements, including RoPE (Rotary Position Embedding), the SwiGLU activation, and RMSNorm. Qwen-Chat-14B supports a context window of up to 8,192 tokens. For alignment with human preferences, Alibaba applied Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to improve the helpfulness and safety of its responses.
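To illustrate two of those architectural components, here is a minimal NumPy sketch of RMSNorm and a SwiGLU feed-forward block. The dimensions and epsilon value are illustrative; this is a sketch of the general technique, not Qwen's actual implementation:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the reciprocal root-mean-square of the features,
    # without the mean-subtraction step used by standard LayerNorm.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu(x, W_gate, W_up, W_down):
    # SwiGLU feed-forward block: a SiLU-gated linear unit followed by a
    # down-projection back to the model dimension.
    silu = lambda z: z / (1.0 + np.exp(-z))  # SiLU (swish) activation
    return (silu(x @ W_gate) * (x @ W_up)) @ W_down

rng = np.random.default_rng(0)
d_model, d_ff = 8, 16                       # toy sizes for illustration
x = rng.standard_normal((2, d_model))
h = rms_norm(x, np.ones(d_model))
y = swiglu(h,
           rng.standard_normal((d_model, d_ff)),
           rng.standard_normal((d_model, d_ff)),
           rng.standard_normal((d_ff, d_model)))
print(y.shape)  # (2, 8)
```

After RMSNorm with unit weights, each row has a root-mean-square of roughly 1, which is the normalization property the layer provides.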

Evaluation results indicate that Qwen-Chat-14B performs competitively against other models of similar scale on benchmarks for natural language understanding, logical reasoning, and mathematical problem-solving. The model is released under Alibaba's Tongyi Qianwen License Agreement, which permits both research and commercial use.

Rankings & Comparison