Meta
Open Weights

Llama 3 Instruct 8B

Released Apr 2024

Meta's Llama 3 Instruct 8B is a large language model tuned for dialogue and instruction following, released as part of the initial Llama 3 family. It uses a standard decoder-only transformer architecture with Grouped-Query Attention (GQA) to improve inference efficiency. The model is designed for a range of natural language tasks, with an emphasis on reasoning and code generation.

The model was pretrained on over 15 trillion tokens of publicly available data, a significant increase over the training volume of the Llama 2 series. To adapt it for chat and assistant-style interactions, the base model underwent Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) on a dataset that included over 10 million human-annotated examples.

Key technical updates include a new tokenizer with a vocabulary of 128,256 tokens, which improves encoding efficiency and contributes to overall performance gains. The model supports a context window of 8,192 tokens. Meta developed it with a focus on helpfulness and safety, using improved alignment techniques to reduce false refusals and increase response diversity.
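To make the GQA efficiency claim concrete, the sketch below estimates the key/value cache size for one sequence at the full 8,192-token context. The layer count, head counts, head dimension, and 16-bit cache precision are assumptions taken from the commonly published 8B configuration and are not stated in the description above; `kv_cache_bytes` is a hypothetical helper written for this illustration.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for a single sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed configuration (not given in the description above).
N_LAYERS = 32
N_QUERY_HEADS = 32   # a multi-head attention baseline would cache this many heads
N_KV_HEADS = 8       # with GQA, groups of query heads share one KV head
HEAD_DIM = 128
CONTEXT = 8192       # context window from the description

gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT)
mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, CONTEXT)

GIB = 1024 ** 3
print(f"GQA KV cache: {gqa / GIB:.2f} GiB")   # 1.00 GiB
print(f"MHA KV cache: {mha / GIB:.2f} GiB")   # 4.00 GiB
print(f"Reduction:    {mha // gqa}x")         # 4x
```

Under these assumptions, sharing each key/value head across four query heads cuts the per-sequence cache from about 4 GiB to about 1 GiB, which is the main source of the inference savings attributed to GQA.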

Rankings & Comparison