
# DBRX Instruct

- Released: Mar 2024
- Intelligence rank: #426
- Arena AI rank: #217
- Context: 33K tokens
- Parameters: 132B

DBRX Instruct is an open-weights large language model developed by Databricks using a fine-grained Mixture-of-Experts (MoE) architecture. As the instruction-tuned variant of the DBRX family, it is optimized for conversational quality, reasoning, and code generation. The model has 132 billion total parameters, of which roughly 36 billion are active on any single inference pass, balancing model capacity against computational cost.

## Architecture and Training

The architecture uses 16 experts and routes each token to 4 of them, which yields a far larger number of possible expert combinations than coarser MoE models such as Mixtral or Grok-1. The model was pre-trained on 12 trillion tokens of curated text and code. DBRX Instruct supports a context window of up to 32,768 tokens and employs modern transformer techniques including rotary position embeddings (RoPE), gated linear units (GLU), and grouped-query attention (GQA).

## Capabilities and License

DBRX Instruct is designed as a general-purpose assistant with particular strengths in programming and mathematical reasoning. Trained with a curriculum learning approach, it matched or exceeded several established open models on benchmarks at launch. It is released under the Databricks Open Model License, which permits commercial use subject to specific terms.
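The fine-grained routing described above can be illustrated with a minimal sketch. This is not the actual DBRX implementation; the function and variable names are hypothetical, and only the 16-experts / top-4 configuration is taken from the source. It also shows why 16-choose-4 routing offers many more expert subsets than Mixtral's 8-choose-2.

```python
import math
import random

NUM_EXPERTS = 16  # total experts per MoE layer (per the DBRX description)
TOP_K = 4         # experts activated per token

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_token(router_logits):
    """Pick the TOP_K highest-scoring experts and renormalize their weights.

    Illustrative only: a real router is a learned linear layer producing
    these logits from the token's hidden state.
    """
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:TOP_K]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))  # 4 (expert_id, weight) pairs, weights sum to 1

logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
print(route_token(logits))

# Fine-grained routing gives many more possible expert subsets per token:
print(math.comb(16, 4), math.comb(8, 2))  # 1820 vs 28 for Mixtral-style 8-choose-2
```

Each token's output is then a weighted sum of its four chosen experts' outputs, which is how only ~36B of the 132B parameters are exercised per forward pass.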
