Alibaba
Open Weights

Qwen3 Next 80B A3B Instruct

Released Sep 2025

Intelligence
#205
Coding
#209
Math
#104
Context: 262K tokens
Parameters: 80B total (3B active)

Qwen3-Next-80B-A3B-Instruct is a large-scale language model developed by Alibaba Cloud's Qwen team, representing a shift toward high-sparsity Mixture-of-Experts (MoE) architectures. The model features approximately 80 billion total parameters with only 3 billion parameters activated per token. This design is intended to deliver the performance of a dense high-parameter model while maintaining the inference speed and training efficiency of a much smaller system.
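The "3B active" figure comes from sparse expert routing: for each token, a small router selects only a few experts, and just those experts' parameters participate in the forward pass. A minimal sketch of top-k MoE routing (generic illustration, not Qwen's exact router; all names and dimensions here are invented for the example):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through a sparse Mixture-of-Experts layer.

    x       : (d,) token hidden state
    gate_w  : (d, n_experts) router weights
    experts : list of callables, each a small FFN mapping (d,) -> (d,)
    k       : number of experts activated per token
    """
    logits = x @ gate_w                      # one router score per expert
    top = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only k experts execute; the remaining experts' parameters stay idle,
    # which is why active parameters << total parameters.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 16, 8
gate_w = rng.normal(size=(d, n_experts))
experts = [
    (lambda W: (lambda x: np.tanh(x @ W)))(rng.normal(size=(d, d)) / np.sqrt(d))
    for _ in range(n_experts)
]
y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # (16,)
```

Scaling the same idea to 80B total / 3B active just means many large experts with a small k, so compute per token tracks the active subset, not the full parameter count.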

The architecture pairs Gated DeltaNet (a linear-attention variant) with Gated Attention in a hybrid layout. This combination is designed to sidestep the quadratic complexity of standard attention, enabling efficient processing of ultra-long contexts. The model supports a native context window of 262,144 tokens, extensible to 1 million tokens via YaRN scaling.

Optimized for instruction following, this variant is fine-tuned for tasks such as complex reasoning, code generation, and multilingual interaction. Unlike the "Thinking" versions in the same series, the Instruct model is designed for direct, stable responses without visible chain-of-thought blocks. It was trained on a 15-trillion-token corpus and incorporates a multi-token prediction mechanism to further accelerate inference throughput.
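Multi-token prediction speeds up decoding by proposing several future tokens per step and keeping only the prefix that the main model's own predictions confirm. A toy acceptance loop in the style of speculative decoding (a generic sketch of the acceptance logic, not Qwen's specific MTP implementation; the token IDs are made up):

```python
def accept_draft(draft_tokens, verify_tokens):
    """Accept the longest prefix of a multi-token draft that the main
    model agrees with; the first mismatch is replaced by the verified
    token, so output is identical to one-token-at-a-time decoding.

    draft_tokens  : tokens proposed by the multi-token prediction head
    verify_tokens : tokens the main model would emit at each position
    """
    accepted = []
    for d_tok, v_tok in zip(draft_tokens, verify_tokens):
        if d_tok == v_tok:
            accepted.append(d_tok)   # draft confirmed, keep going
        else:
            accepted.append(v_tok)   # first disagreement: take the verified token
            break
    return accepted

result = accept_draft([5, 9, 2, 7], [5, 9, 4, 7])
print(result)  # [5, 9, 4] -- two draft tokens accepted, then corrected
```

When the draft head is accurate, several tokens are committed per model step, raising throughput without changing the output distribution.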
