Qwen3.5-397B-A17B is a large-scale multimodal language model released by Alibaba's Qwen team in February 2026. Built on a hybrid Mixture-of-Experts (MoE) architecture, it features 397 billion total parameters with only 17 billion active per token, enabling a balance between high-capacity knowledge and inference efficiency. The model is a native vision-language foundation, trained through early fusion to process text, images, and video within a single unified pipeline.
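The total-versus-active parameter split above comes from sparse Mixture-of-Experts routing: a router picks a few experts per token, so only a small fraction of the weights fire on any forward pass. The sketch below illustrates the routing idea only; the expert count (64) and top-k value (4) are illustrative assumptions, not the model's real configuration.

```python
import numpy as np

def topk_route(logits: np.ndarray, k: int):
    """Return the indices and softmax weights of the top-k experts.

    logits : (num_experts,) router scores for one token
    k      : number of experts activated per token
    """
    idx = np.argsort(logits)[-k:]                  # pick the k highest-scoring experts
    w = np.exp(logits[idx] - logits[idx].max())    # numerically stable softmax
    return idx, w / w.sum()                        # weights over the active experts only

# One token routed among 64 hypothetical experts, 4 active at a time.
rng = np.random.default_rng(0)
idx, w = topk_route(rng.normal(size=64), k=4)
```

Because only `k` experts run per token, compute scales with the active parameter count (here 17B) rather than the total (397B), which is what makes the high total-to-active ratio affordable at inference time.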
The model's architecture introduces Gated Delta Networks combined with linear attention mechanisms, a combination that significantly reduces computational overhead compared to traditional quadratic attention. This design enables decoding throughput reported to be up to 19 times faster than previous flagship models such as Qwen3-Max. By using a high ratio of total to active parameters, the model maintains deep specialized knowledge across 201 languages and dialects while remaining tractable for local deployment on high-end consumer hardware with advanced quantization.
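The efficiency claim rests on the recurrent form of linear attention: instead of attending over the whole history, the layer maintains a fixed-size state that is decayed by a gate and updated with a rank-1 "delta" correction each step. The following is a minimal sketch of one such gated delta update, assuming a per-step scalar decay `alpha` and write strength `beta`; it is an illustration of the general technique, not the model's exact recurrence.

```python
import numpy as np

def gated_delta_step(S, k, v, q, alpha, beta):
    """One recurrent step of a gated delta rule (illustrative sketch).

    S       : (d, d) running state matrix mapping keys to values
    k, v, q : (d,) key, value, and query vectors (k assumed L2-normalized)
    alpha   : scalar decay gate in (0, 1]
    beta    : scalar write-strength gate in (0, 1]
    """
    # Decay the old state and subtract the stale association stored at key k
    # (the "delta" correction), then write the new key-value pair.
    S = alpha * (S - beta * np.outer(S @ k, k)) + beta * np.outer(v, k)
    # Read out with the query: cost per step is O(d^2), independent of
    # sequence length, unlike quadratic attention.
    o = S @ q
    return S, o
```

Because the state has fixed size, decoding cost per token is constant in sequence length, which is the source of the large throughput gains over quadratic attention at long contexts.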
Optimized for complex problem-solving, the model includes a native Thinking Mode enabled by large-scale reinforcement learning (RL). When active, the model generates internal chain-of-thought reasoning within specific tags before providing a final response. This reasoning process excels in STEM fields, agentic workflows, and multi-step coding tasks, where it demonstrates improved performance in verifying its own logic and following intricate instructions.
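Since the chain-of-thought is emitted inside reasoning tags ahead of the final answer, downstream code typically strips it before showing output to users. Below is a small hypothetical helper assuming the `<think>...</think>` tag convention used by earlier Qwen thinking models; the tag name is an assumption about this release.

```python
import re

# Assumed convention: reasoning appears inside <think>...</think>,
# followed by the final response.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(completion: str):
    """Split a raw completion into (reasoning, final_answer)."""
    m = THINK_RE.search(completion)
    if m is None:
        # Non-thinking output: no reasoning block to separate.
        return "", completion.strip()
    reasoning = m.group(1).strip()
    answer = completion[m.end():].strip()
    return reasoning, answer
```

A pattern like this lets an application log or score the reasoning trace separately while returning only the final answer to the user.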
It natively supports a context window of 262,144 tokens, which can be extended to over 1,000,000 tokens for long-context applications such as document analysis or processing up to two hours of video content. The model is released under an Apache 2.0 license, supporting both research and commercial applications.
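Context extension beyond the native window is commonly configured through RoPE scaling. The fragment below sketches a Hugging Face-style `rope_scaling` entry using YaRN, which past Qwen releases have documented for long-context use; whether this model uses the same mechanism and factor is an assumption here, not a confirmed detail.

```python
# Native window from the release notes; the YaRN factor of 4.0 is an
# illustrative assumption that would yield 1,048,576 positions.
NATIVE_CONTEXT = 262_144

rope_scaling = {
    "rope_type": "yarn",                              # YaRN-style extension (assumed)
    "factor": 4.0,                                    # hypothetical scaling factor
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

extended_context = int(rope_scaling["factor"] * NATIVE_CONTEXT)
```

With a factor of 4.0, the extended window of 1,048,576 tokens clears the "over 1,000,000 tokens" figure cited above.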