Alibaba
Open Weights

qwen3-coder-480b-a35b-instruct

Released Jul 2025

Arena AI rank: #82
Parameters: 480B (35B active)

Qwen3-Coder-480B-A35B-Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Alibaba's Qwen team, specifically optimized for coding and agentic tasks. Released in July 2025, it serves as the high-capacity variant of the Qwen3-Coder series. The model has 480 billion total parameters but uses a sparse activation strategy, engaging only 35 billion parameters per forward pass to balance performance with computational efficiency.
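The efficiency claim above comes down to simple arithmetic: compute per token scales with the *active* parameter count, while memory scales with the *total* count, since all experts must stay resident. A quick back-of-the-envelope check, assuming bf16 (2 bytes per parameter) for illustration:

```python
# Sparse-activation arithmetic for the figures in the model card.
# bf16 (2 bytes/parameter) is an illustrative assumption.
total_params = 480e9   # 480B total parameters
active_params = 35e9   # 35B activated per forward pass
bytes_per_param = 2    # bf16

active_fraction = active_params / total_params
print(f"Active fraction per token: {active_fraction:.1%}")  # ~7.3%

# Memory footprint is driven by the full parameter set ...
print(f"Weights in bf16: {total_params * bytes_per_param / 1e12:.2f} TB")
# ... while per-token compute is driven by the active subset.
print(f"Active weights per token: {active_params * bytes_per_param / 1e9:.0f} GB")
```

So each token pays the FLOP cost of roughly a 35B dense model, even though serving the weights requires close to a terabyte of accelerator memory in bf16.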

The model is built on a decoder-only transformer architecture with 62 layers and 160 specialized expert networks, activating 8 experts per token. It employs Grouped Query Attention (GQA) and supports a native context window of 256,000 tokens, which can be extended to 1 million tokens via YaRN extrapolation. This extensive context capacity is designed to facilitate repository-scale understanding and the processing of complex, multi-file codebases.
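The "160 experts, 8 active per token" design above follows the standard top-k routing pattern: a learned router scores every expert for each token, the top 8 scores are kept, and a softmax over those scores produces gate weights for combining the chosen experts' outputs. A minimal toy sketch of that selection step (random stand-in weights and a shrunken hidden size; the real router is a learned linear layer):

```python
import math
import random

# Toy sketch of top-k expert routing as in MoE layers like the one
# described above (160 experts, 8 active per token). Router weights are
# random stand-ins; d_model is shrunk far below the real hidden size.
random.seed(0)
num_experts, top_k, d_model = 160, 8, 16

x = [random.gauss(0, 1) for _ in range(d_model)]        # one token's hidden state
router = [[random.gauss(0, 1) for _ in range(d_model)]  # one weight row per expert
          for _ in range(num_experts)]

# Router scores: one logit per expert for this token.
logits = [sum(w * xi for w, xi in zip(row, x)) for row in router]

# Select the top-k experts, then softmax their logits into gate weights.
top = sorted(range(num_experts), key=lambda i: logits[i], reverse=True)[:top_k]
peak = max(logits[i] for i in top)
exps = [math.exp(logits[i] - peak) for i in top]
gates = [e / sum(exps) for e in exps]

# Only these 8 expert FFNs run on x; the layer output is their
# gate-weighted sum (the expert computation itself is omitted here).
print(f"chosen experts: {sorted(top)}, gate sum: {sum(gates):.6f}")
```

Because only 8 of the 160 expert FFNs execute per token, the layer retains the capacity of the full expert pool while paying roughly 1/20 of its compute.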

Key Capabilities

Qwen3-Coder-480B-A35B-Instruct is engineered for agentic workflows, demonstrating proficiency in autonomous multi-step problem solving, iterative debugging, and tool usage. It was trained on 7.5 trillion tokens with a 70% code-to-text ratio and further refined using long-horizon reinforcement learning (Agent RL). These optimizations enable the model to perform tasks such as repository-level reasoning, browser automation, and sophisticated function calling across a wide range of programming languages, including Python, C++, Rust, and Go.
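In practice, the function calling described above is typically driven through an OpenAI-compatible chat API: the client declares available tools as JSON schemas, and the model responds with structured tool calls. A sketch of such a request body, where the tool name, arguments, and model id string are illustrative assumptions (check your serving stack, e.g. vLLM or DashScope, for exact values):

```python
import json

# Sketch of an OpenAI-compatible function-calling request for this model.
# "run_tests" is a hypothetical tool; the model id string is illustrative.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return failures.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Directory to test."},
            },
            "required": ["path"],
        },
    },
}]

payload = {
    "model": "qwen3-coder-480b-a35b-instruct",  # illustrative model id
    "messages": [{"role": "user", "content": "Fix the failing tests in ./src"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide when to call the tool
}

# This dict is the JSON body you would POST to a chat/completions endpoint.
print(json.dumps(payload)[:80])
```

In an agentic loop, the client executes each returned tool call, appends the result as a tool message, and re-queries the model until it emits a final answer.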

Rankings & Comparison