Tencent

Hunyuan3D 3.1

Released Feb 2026

Crafiq Arena: #1
Parameters: 10B

Hunyuan3D 3.1 is a generative 3D model developed by Tencent and an optimized release in the company's 3D asset generation suite. It uses a hierarchical 3D-DiT (Diffusion Transformer) architecture, first introduced in the 3.0 series to balance global structure against intricate local detail. The model produces high-fidelity 3D assets from natural language descriptions or from single-view or multi-view images, and it generates production-ready meshes with clean, quad-dominant topology.

Technical Architecture

The 10-billion-parameter model uses a multi-stage pipeline that separates geometry generation from texture synthesis. The process begins with an input encoding stage (text or image), followed by a latent-space shape generation phase powered by the flow-based hierarchical DiT. The final stages are mesh reconstruction and a dedicated high-resolution texture synthesis module that produces Physically Based Rendering (PBR) materials, including albedo, normal, and roughness maps. This architecture allows the model to achieve up to a threefold improvement in modeling accuracy compared to earlier iterations.
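The stage order described above can be pictured with a small Python sketch. It is only an illustration of the data flow: the stage names and PBR map names come from the description, while `Asset`, `run_stage`, and `generate` are hypothetical stand-ins and not part of any Tencent API. No model is called; each stage is simply recorded.

```python
from dataclasses import dataclass, field

# Stage and map names follow the description above.
# Every class and function here is a hypothetical stand-in.
PIPELINE_STAGES = (
    "input_encoding",
    "latent_shape_generation",
    "mesh_reconstruction",
    "texture_synthesis",
)
PBR_MAPS = ("albedo", "normal", "roughness")


@dataclass
class Asset:
    source: str                                    # text prompt or image path
    completed: list = field(default_factory=list)  # stages finished so far
    pbr_maps: dict = field(default_factory=dict)   # map name -> placeholder file


def run_stage(asset: Asset, stage: str) -> Asset:
    """Record a stage as done; a real system would invoke the model here."""
    if len(asset.completed) >= len(PIPELINE_STAGES):
        raise ValueError("pipeline already finished")
    expected = PIPELINE_STAGES[len(asset.completed)]
    if stage != expected:
        raise ValueError(f"stage {stage!r} is out of order; expected {expected!r}")
    asset.completed.append(stage)
    if stage == "texture_synthesis":
        # Placeholder file names, one per PBR map named in the text.
        asset.pbr_maps = {name: f"{name}.png" for name in PBR_MAPS}
    return asset


def generate(source: str) -> Asset:
    """Run every stage in order: geometry first, textures last."""
    asset = Asset(source=source)
    for stage in PIPELINE_STAGES:
        asset = run_stage(asset, stage)
    return asset
```

Calling `generate("a weathered copper kettle")` returns an `Asset` whose `completed` list matches `PIPELINE_STAGES`. The ordering check in `run_stage` reflects the separation the text describes, where texture synthesis only runs once a mesh exists.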

Capabilities and Workflow

Hunyuan3D 3.1 supports the generation of assets with high polygon counts (up to 1.5 million faces) and exports to industry-standard formats such as OBJ, GLB, and FBX. Its enhanced geometry accuracy and faster generation speeds (typically under two minutes) make it suitable for professional workflows in game development, AR/VR, and 3D printing.
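As a rough illustration of the limits quoted above, the sketch below checks a requested face count and export format before a job is submitted. The 1.5 million face ceiling and the three formats come from the text; `ExportRequest` and `validate` are hypothetical names, and a real service may enforce different or additional constraints.

```python
from dataclasses import dataclass

MAX_FACES = 1_500_000                   # upper bound quoted in the text
EXPORT_FORMATS = {"obj", "glb", "fbx"}  # formats named in the text


@dataclass(frozen=True)
class ExportRequest:
    """Hypothetical request shape, for illustration only."""
    target_faces: int
    file_format: str


def validate(request: ExportRequest) -> list[str]:
    """Return a list of problems; an empty list means the request looks fine."""
    problems = []
    if not 0 < request.target_faces <= MAX_FACES:
        problems.append(
            f"target_faces must be between 1 and {MAX_FACES}, "
            f"got {request.target_faces}"
        )
    if request.file_format.lower() not in EXPORT_FORMATS:
        problems.append(f"unsupported format: {request.file_format!r}")
    return problems
```

For example, `validate(ExportRequest(200_000, "GLB"))` returns an empty list, while a request for 2 million faces in a format outside the listed set reports both problems.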

For optimal results when using image-to-3D, users should provide reference images with a simple, neutral background and a single object that occupies at least 50% of the frame. For text-to-3D, detailed prompts that specify both the object's structural properties and its material finish (e.g., "weathered copper" or "polished marble") help the model produce more consistent texture mapping. The system also supports part segmentation, which allows more complex object decompositions and improves structural coherence.
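The input guidelines above can be checked before submission with a short Python sketch. Two assumptions are made here: "occupies at least 50% of the frame" is read as the object's bounding box covering at least half of the image area, and a binary foreground mask (nested lists of 0 and 1) is already available from some separate segmentation step. The function names are hypothetical.

```python
def bbox_coverage(mask):
    """Fraction of the frame covered by the bounding box of foreground pixels.

    `mask` is a non-empty list of equal-length rows; truthy cells are foreground.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return 0.0  # no foreground at all
    width = len(mask[0])
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    box_area = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)
    return box_area / (len(mask) * width)


def meets_framing_guideline(mask, minimum=0.5):
    """True when the object's bounding box fills at least `minimum` of the frame."""
    return bbox_coverage(mask) >= minimum


def build_prompt(structure, material):
    """Combine a structural description with a material finish in one prompt."""
    return f"{structure.strip()}, {material.strip()} finish"
```

A 4x4 mask whose foreground fills the top-left 3x3 block has a coverage of 9/16 and passes the check, while a single foreground pixel does not. `build_prompt("a tall ridged vase", "polished marble")` yields a prompt that names both structure and finish, matching the text-to-3D advice.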

Rankings & Comparison