xAI
Open Weights

Grok-1

Released Mar 2024

Intelligence rank: #363
Context: 8K
Parameters: 314B

Grok-1 is a 314 billion parameter Mixture-of-Experts (MoE) large language model developed by xAI. Released in March 2024 under the Apache 2.0 license, it is one of the largest open-weights models available to the public. The model was trained from scratch by xAI using a custom training stack built on JAX and Rust, designed for high-performance distributed computing.

Architecture and Design

The model uses a Mixture-of-Experts architecture with eight experts, two of which are selected for any given token. This keeps inference computationally efficient despite the massive total parameter count: only about 25% of the parameters (roughly 79 billion) are active per token. The model has 64 layers and uses grouped-query attention, with 48 attention heads for queries and 8 attention heads for keys and values. It employs Rotary Positional Embeddings (RoPE) and has a vocabulary size of 131,072.
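The routing scheme described above can be sketched in a few lines. This is an illustrative top-k router in plain numpy, not xAI's implementation (which is a custom JAX/Rust stack); the shapes, the `moe_layer` function, and the toy experts are all hypothetical, but the 2-of-8 selection ratio matches Grok-1's configuration.

```python
# Hypothetical sketch of top-k Mixture-of-Experts routing.
# Not Grok-1's actual code; shapes and names are illustrative.
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x       : (tokens, d_model) token activations
    gate_w  : (d_model, n_experts) router weights
    experts : list of n_experts callables, each (d_model,) -> (d_model,)
    """
    logits = x @ gate_w                          # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, topk[t]]
        weights = np.exp(sel - sel.max())
        weights /= weights.sum()                 # softmax over selected experts only
        for w, e in zip(weights, topk[t]):
            out[t] += w * experts[e](x[t])       # only k experts run per token
    return out

# Toy usage: 8 experts, 2 active per token -- the same ratio as Grok-1,
# where ~25% of the 314B parameters are active for any given token.
rng = np.random.default_rng(0)
d, n_exp = 16, 8
experts = [(lambda W: (lambda v: v @ W))(rng.normal(size=(d, d)) / d)
           for _ in range(n_exp)]
x = rng.normal(size=(4, d))
y = moe_layer(x, rng.normal(size=(d, n_exp)), experts)
print(y.shape)  # (4, 16)
```

Because the router runs only the selected experts, per-token compute scales with the ~79B active parameters rather than the full 314B.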

Training and Capabilities

Grok-1 was released as a raw base model: it has not been fine-tuned for specific applications such as conversational dialogue or instruction following. It was trained on a diverse text corpus with a knowledge cutoff in late 2023. At the time of its release, it demonstrated competitive performance on standard language modeling benchmarks, particularly in areas requiring reasoning and knowledge retrieval.

Rankings & Comparison