Alibaba
Open Weights

Qwen Image Max 2512

Released Dec 2025

Qwen Image Max 2512 is a large-scale image generation model developed by Alibaba's Qwen team and released in late 2025. It is built on a 20-billion-parameter Multimodal Diffusion Transformer (MMDiT) architecture, a departure from the U-Net backbones used in traditional diffusion models. The model is designed to produce high-fidelity visuals, with a specific focus on reducing the artificial "plastic" look common in AI-generated imagery.

Key Capabilities

  • Enhanced Human Realism: The model incorporates architectural updates that improve the rendering of skin textures, pores, and hair, allowing for more naturalistic human portraits and varied age-related details like wrinkles and freckles.
  • Advanced Text Rendering: It excels at generating complex textual elements within images, supporting legible multilingual layouts for posters, infographics, and presentations. This includes maintaining visual hierarchy and character accuracy even with longer text strings.
  • High-Resolution Output: The system supports native resolutions up to 2048×2048, enabling the creation of finely detailed scenes across landscapes, architecture, and intricate natural textures such as water ripples and animal fur.
  • Bilingual Understanding: Optimized for both Chinese and English, the model demonstrates high instruction-following performance for nuanced prompts in both languages.
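
As a rough illustration of the 2048×2048 ceiling mentioned above, the sketch below picks output dimensions for a given aspect ratio so that the longer side never exceeds 2048 pixels. The helper name `fit_resolution` is hypothetical, and rounding each side down to a multiple of 16 is an assumption (a common constraint in diffusion pipelines), not a documented requirement of this model.

```python
def fit_resolution(aspect_w: int, aspect_h: int,
                   max_side: int = 2048, multiple: int = 16) -> tuple[int, int]:
    """Return (width, height) for an integer aspect ratio.

    The longer side is capped at max_side (2048, per the model card).
    Rounding down to `multiple` is an assumption, not a documented rule.
    """
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio components must be positive integers")

    longer = max(aspect_w, aspect_h)
    # Integer arithmetic avoids floating-point rounding surprises.
    width = aspect_w * max_side // longer
    height = aspect_h * max_side // longer

    # Snap each side down to the nearest multiple, but never below one step.
    width = max(width // multiple * multiple, multiple)
    height = max(height // multiple * multiple, multiple)
    return width, height


if __name__ == "__main__":
    for ratio in [(1, 1), (16, 9), (9, 16), (3, 2)]:
        print(ratio, "->", fit_resolution(*ratio))
```

Because of the rounding step, ratios that do not divide evenly (such as 3:2) come out slightly off the exact ratio; a caller who needs exact proportions would have to choose a smaller size instead.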

In blind human evaluations on the AI Arena platform, the model has been recognized as a top-performing open-source system and is frequently compared with top-tier proprietary models on prompt adherence and compositional quality. It is released under the Apache 2.0 license, which permits both research and commercial use.

Rankings & Comparison