Llama 3.2 3B Instruct is a lightweight, multilingual large language model developed by Meta, specifically optimized for edge computing and mobile devices. As part of the Llama 3.2 collection, this instruction-tuned model balances efficiency with performance, making it suitable for tasks such as on-device summarization, query rewriting, and agentic retrieval.
Architecture and Development
The model uses an auto-regressive transformer architecture. It was developed through pruning and knowledge distillation from larger Llama 3.1 models, specifically the 8B and 70B versions, so that it retains much of their reasoning and language-understanding ability within a smaller parameter footprint. After pre-training, the model went through multiple rounds of alignment using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
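To make the idea of knowledge distillation concrete, the following is a minimal sketch of a token-level distillation loss, in which a small "student" model is trained to match the output distribution of a larger "teacher". It is an illustration of the general technique only, not Meta's training code: the function names, the temperature value, and the toy four-token vocabulary are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the vocabulary at one token position.

    The temperature of 2.0 is an illustrative choice, not a documented value.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over a toy 4-token vocabulary.
teacher = [2.0, 1.0, 0.1, -1.0]
close_student = [1.9, 1.1, 0.0, -0.9]  # nearly agrees with the teacher
far_student = [-1.0, 0.1, 1.0, 2.0]    # ranks the tokens in reverse order

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(f"close student: {loss_close:.4f}, far student: {loss_far:.4f}")
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is what lets a smaller model absorb behavior from a larger one during training.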
Capabilities and Multilingual Support
Llama 3.2 3B Instruct supports a context window of 128k tokens, which lets it process long-form documents and follow complex instructions. It officially supports eight languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. The model is designed for resource-constrained environments where low latency and on-device privacy are critical requirements.
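As a rough illustration of how an application might respect these limits, the sketch below checks a request against the context window and the supported-language list before sending it to the model. Everything here is hypothetical helper code: the constant of 128,000 is a conservative reading of "128k tokens", the ISO 639-1 language codes are this example's own encoding of the eight languages named above, and the token counts are assumed to come from whatever tokenizer the application already uses.

```python
# Assumption: a conservative reading of the "128k tokens" context window.
CONTEXT_WINDOW = 128_000

# Assumption: ISO 639-1 codes for the eight officially supported languages.
SUPPORTED_LANGUAGES = {"en", "de", "fr", "it", "pt", "hi", "es", "th"}


def fits_in_context(prompt_tokens, max_new_tokens, context_window=CONTEXT_WINDOW):
    """Return True if the prompt plus the reply budget fits in the window."""
    return prompt_tokens + max_new_tokens <= context_window


def split_into_chunks(token_ids, chunk_size):
    """Split a token sequence into consecutive chunks of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [token_ids[i:i + chunk_size] for i in range(0, len(token_ids), chunk_size)]


def is_supported_language(code):
    """Check a language code against the officially supported set."""
    return code.lower() in SUPPORTED_LANGUAGES


# A hypothetical 300,000-token document, reserving 2,000 tokens for the reply.
document = list(range(300_000))
reply_budget = 2_000
chunks = split_into_chunks(document, CONTEXT_WINDOW - reply_budget)
print(len(chunks), [len(c) for c in chunks])
```

Splitting a document this way is the simplest possible strategy; a real summarization pipeline would more likely split on paragraph or section boundaries and then combine the per-chunk results.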