Pixtral Large is a frontier multimodal model developed by Mistral AI, released in November 2024. Building on the architecture of Mistral Large 2, it is a 124-billion parameter model designed to process and reason across both text and visual modalities. It represents a significant scaling of the Pixtral series, aimed at complex reasoning tasks that involve high-resolution imagery and intricate documents. The model features a native vision-language architecture, incorporating a vision encoder that allows it to interpret images of varying aspect ratios and resolutions. Its capabilities include identifying fine-grained details in photographs, interpreting complex charts and diagrams, and performing optical character recognition (OCR) on dense documents. ## Technical Specifications Pixtral Large supports a 128,000-token context window, enabling the analysis of multiple images or lengthy documents within a single session. This high capacity allows it to perform multi-image reasoning, making it suitable for tasks that require comparing visual data or following visual sequences. The model maintains parity with Mistral Large 2's reasoning and multilingual strengths, supporting dozens of languages and demonstrating high performance in coding and mathematical tasks.
Explore AI Studio
Access 50+ top AI models for image, 3D, and audio generation in one unified workspace.
Open AI Studio