DeepSeek
Open Weights

DeepSeek Coder V2 Lite Instruct

Released Jun 2024

Intelligence: #419
Context: 128K
Parameters: 16B

DeepSeek-Coder-V2-Lite-Instruct is an open-source Mixture-of-Experts (MoE) code language model designed for efficient performance in programming and mathematical reasoning. As a scaled-down member of the DeepSeek-Coder-V2 series, it balances computational efficiency with high-level coding proficiency across a wide range of programming languages.

The architecture comprises 16 billion total parameters, of which only 2.4 billion are active per token during inference. It incorporates Multi-head Latent Attention (MLA), a mechanism that compresses the Key-Value (KV) cache to speed up inference and reduce memory footprint. This design allows the model to handle context lengths of up to 128,000 tokens.

DeepSeek-Coder-V2-Lite-Instruct supports 338 programming languages, a significant expansion over the 86 supported by its predecessor. It is trained on a diverse dataset of trillions of tokens with a heavy emphasis on code and mathematics, enabling tasks such as code generation, debugging, and complex algorithmic reasoning.
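As an illustration, here is a minimal sketch of running the model through Hugging Face Transformers. It assumes the published checkpoint id `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct` and a GPU with enough memory for the 16B weights; DeepSeek-V2 checkpoints ship custom modeling code, hence `trust_remote_code=True`.

```python
# Minimal sketch: chat-style code generation with
# DeepSeek-Coder-V2-Lite-Instruct via Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # repo ships custom DeepSeek-V2 modeling code
    device_map="auto",
)

# Instruct variants expect the chat template; apply it before generation.
messages = [{"role": "user", "content": "Write a quicksort in Python."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```

Although all 16B parameters must be resident in memory, the MoE router activates only 2.4B of them per token, which is what keeps per-token inference cost low relative to a dense model of the same size.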

Rankings & Comparison