Tencent

Hy3-preview (Non-reasoning)

Released Apr 2026

Intelligence: #121
Coding: #91
Context: 256K
Parameters: 295B

Hy3-preview (Non-reasoning) is a large Mixture-of-Experts (MoE) language model developed by Tencent and released in April 2026. Positioned at launch as the flagship of the Hunyuan 3.0 series, it marks an architectural shift toward "all-round practicality," balancing capability with inference efficiency. The "Non-reasoning" designation refers to the model's standard operating mode, which favors direct, high-throughput responses over the extended chain-of-thought (CoT) processing used by dedicated reasoning variants.

The model has 295 billion total parameters, of which 21 billion are activated per token during inference. It comprises 80 transformer layers and uses 192 routed experts, of which the top 8 are activated, plus one shared expert. To speed up inference, the model integrates a 3.8-billion-parameter Multi-Token Prediction (MTP) layer for speculative decoding, which contributes to a reported 54% reduction in first-token latency compared with previous generations. It supports a 256K-token context window.
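The sparsity figures above can be made concrete with a short sketch. It computes the share of parameters active per token from the numbers quoted in the text, then illustrates top-8 selection over 192 routed experts. The gating rule (softmax over the selected logits) and the names `active_fraction` and `route_token` are assumptions for illustration; the source does not describe Tencent's actual router.

```python
import math
import random

# Figures quoted in the text above.
TOTAL_PARAMS_B = 295      # total parameters, in billions
ACTIVE_PARAMS_B = 21      # parameters activated per token, in billions
NUM_ROUTED_EXPERTS = 192  # routed experts
TOP_K = 8                 # routed experts activated


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of all parameters that take part in one token's forward pass."""
    return active_b / total_b


def route_token(router_logits: list[float], top_k: int = TOP_K) -> dict[int, float]:
    """Select the top_k experts by logit and softmax-normalize their weights.

    Hypothetical gating rule: the real router is not described in the source.
    """
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - max_logit) for i in top}
    norm = sum(exps.values())
    return {i: e / norm for i, e in exps.items()}


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"active fraction per token: {frac:.1%}")

    # Random logits stand in for a learned router's output.
    rng = random.Random(0)
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]
    weights = route_token(logits)
    print(f"selected experts: {sorted(weights)}")
```

Under these figures roughly 7% of the parameters are active for any given token, which is what allows a 295B-parameter model to run at a per-token compute cost closer to that of a much smaller dense model. The shared expert is omitted from the sketch because it is always active and needs no routing decision.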

Hy3-preview performs strongly in complex instruction following, code generation, and agentic workflows. It is engineered to handle messy, multi-step contexts, such as cross-day business scheduling and budget deduplication, without guessing at missing details. Its reliability is highlighted by its ability to decline to answer when information is incomplete, a trait aimed at production stability. It has scored well on demanding evaluations, including the Tsinghua University mathematics PhD qualifying exam and various STEM-focused benchmarks.

Hy3-preview is optimized for deployment in high-demand environments and supports tool-calling and reasoning parsers. While it maintains a high "Intelligence Index" for a model of its active-parameter class, its primary design goal is cost-effective scalability. Official benchmarks indicate that the model is notably fast and somewhat more verbose than average, which suits applications that need detailed explanations and complex agentic tool use.

Rankings & Comparison