DeepSeek
Open Weights

DeepSeek V3 0324

Released Mar 2025

DeepSeek-V3-0324 is an updated checkpoint of the DeepSeek-V3 large language model, released by DeepSeek in March 2025. It keeps the architecture of the original December 2024 release while improving reasoning, coding proficiency, and mathematical problem-solving. The model uses a Mixture-of-Experts (MoE) design with 671 billion total parameters, of which 37 billion are activated per token.
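
The gap between total and activated parameters comes from sparse expert routing: each token is sent to only a few experts, so most of the network's weights sit idle on any given forward pass. The sketch below illustrates the general top-k MoE routing pattern with toy dimensions; it is not DeepSeek-V3's actual configuration.

```python
import numpy as np

# Illustrative top-k Mixture-of-Experts routing (toy sizes, not
# DeepSeek-V3's real expert count or hidden dimension).
rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    scores = x @ router                  # affinity of the token to each expert
    top = np.argsort(scores)[-top_k:]    # indices of the k highest-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()             # normalized gate weights over selected experts
    # Only the selected experts run; this sparsity is why a 671B-parameter
    # model activates only ~37B parameters per token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(d_model))
print(out.shape)  # (16,)
```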

The model uses Multi-head Latent Attention (MLA) for efficient inference and an auxiliary-loss-free load balancing strategy to keep experts evenly utilized during training. It also incorporates a multi-token prediction (MTP) training objective, which densifies the training signal and can be repurposed for speculative decoding at inference time. Beyond these architectural features, DeepSeek-V3-0324 is aligned with the writing style of the DeepSeek-R1 series, producing higher-quality medium-to-long-form content.
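
The auxiliary-loss-free strategy, as described in the DeepSeek-V3 technical report, replaces a balancing loss term with a per-expert bias that is added to routing scores for expert selection only (gate weights still use the raw scores) and nudged after each batch against overloaded experts. Below is a minimal sketch of that idea; the update rate `gamma` and the sign-based update rule are simplified for illustration.

```python
import numpy as np

# Sketch of auxiliary-loss-free load balancing: a per-expert bias steers
# expert *selection* toward underloaded experts without adding a loss term.
rng = np.random.default_rng(1)
n_experts, top_k, gamma = 8, 2, 0.01   # gamma: bias update speed (assumed toy value)
bias = np.zeros(n_experts)

def select_experts(scores):
    """Pick top-k experts by biased score; gate weights still use raw scores."""
    biased = scores + bias
    top = np.argsort(biased)[-top_k:]
    gates = np.exp(scores[top]) / np.exp(scores[top]).sum()
    return top, gates

def update_bias(counts):
    """Lower the bias of overloaded experts, raise it for underloaded ones."""
    global bias
    target = counts.mean()
    bias -= gamma * np.sign(counts - target)

# One simulated batch of token routings, then a balancing step.
counts = np.zeros(n_experts)
for _ in range(256):
    top, _ = select_experts(rng.standard_normal(n_experts))
    counts[top] += 1
update_bias(counts)
print(np.round(bias, 3))
```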

Key functional improvements include more accurate function calling, more reliable JSON output, and better multi-turn interactive rewriting. The model supports a 128,000-token context window, and its weights are released under the MIT License.
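
As a usage sketch, the JSON-output mode can be exercised through an OpenAI-compatible client. This assumes DeepSeek's hosted API, where the `deepseek-chat` alias served this checkpoint at release; the endpoint and model name will differ for self-hosted weights.

```python
from openai import OpenAI

# Assumes DeepSeek's OpenAI-compatible hosted endpoint; substitute your own
# server URL and model name if serving the open weights locally.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "List three prime numbers as {\"primes\": [...]}."},
    ],
    response_format={"type": "json_object"},  # JSON mode: constrains output to valid JSON
)
print(resp.choices[0].message.content)
```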

Rankings & Comparison