Qwen-14B-Chat is a large language model developed by the Alibaba Cloud Qwen team. It is the chat-aligned version of the Qwen-14B base model, with 14 billion parameters, and is designed for a broad range of natural language tasks, with particular strength in both Chinese and English.
The model utilizes a Transformer-based architecture with causal attention and was pre-trained on a massive dataset comprising over 3 trillion tokens. This training corpus includes a diverse mix of web data, professional documents, code, and mathematical content. To optimize it for conversational use, the model underwent supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to improve its instruction-following and safety alignment.
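The causal attention mentioned above means each token can attend only to itself and earlier positions. As a minimal NumPy sketch of that idea (an illustration, not Qwen's actual implementation), a lower-triangular mask can be applied to attention scores before the softmax:

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """Lower-triangular mask: position i may attend only to positions <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_attention_weights(scores: np.ndarray) -> np.ndarray:
    """Apply the causal mask, then softmax over the key dimension."""
    seq_len = scores.shape[-1]
    masked = np.where(causal_mask(seq_len), scores, -np.inf)
    exp = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

# With uniform scores, each position averages over the tokens it can see:
probs = masked_attention_weights(np.zeros((4, 4)))
print(probs[2])  # position 2 attends equally to tokens 0..2: [1/3, 1/3, 1/3, 0]
```

Masked-out (future) positions receive weight exactly zero, so generation at each step depends only on the prefix.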
Capabilities and Performance
Qwen-14B-Chat supports a context length of up to 8,000 tokens and performs well at reasoning, coding, and creative writing. It is particularly strong on bilingual benchmarks, often surpassing similarly sized models on Chinese language understanding and generation tasks, and it handles complex multi-turn dialogues and mathematical problem solving.
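Multi-turn dialogue works by serializing the conversation history into the prompt on every turn. Qwen chat models follow the ChatML convention for this; the sketch below (a simplified illustration with a hypothetical helper name, not the official API, which handles this via the tokenizer and the model's chat interface) shows how a history of user/assistant exchanges might be flattened into ChatML-style text:

```python
def build_chatml_prompt(system: str, turns: list[tuple[str, str]], query: str) -> str:
    """Serialize a multi-turn dialogue into ChatML-style text.

    `turns` holds (user, assistant) pairs from earlier exchanges;
    `query` is the new user message awaiting a reply.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for user, assistant in turns:
        parts.append(f"<|im_start|>user\n{user}<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant}<|im_end|>")
    parts.append(f"<|im_start|>user\n{query}<|im_end|>")
    # The prompt ends with an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    [("What is 2+2?", "4.")],
    "And 3+3?",
)
print(prompt)
```

Because the full history is re-sent each turn, the 8,000-token context bounds how much dialogue can be carried forward before older turns must be truncated or summarized.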