
Sarvam 105B (Reasoning)

Released Feb 2026

Intelligence rank: #235
Coding rank: #289
Context: 128K
Parameters: 105B

Sarvam 105B (Reasoning) is a foundational large language model developed by Sarvam AI, engineered for deep reasoning, complex problem-solving, and agentic workflows. Trained from scratch on domestic compute infrastructure as part of India's sovereign AI mission, it is optimized for the Indian linguistic landscape: it fully supports all 22 official Indian languages and handles regional context and mixed-language inputs such as Hinglish.

Architecture and Efficiency

The model uses a Mixture-of-Experts (MoE) architecture with 105 billion total parameters. To reduce inference cost and latency, it employs sparse activation: only 10.3 billion parameters are active for any single token. Attention is handled by a Multi-Head Latent Attention (MLA) stack designed to improve representational bandwidth and attention expressivity, and the model applies YaRN scaling to its rotary position embeddings to maintain performance across its long context window.
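
To make the sparse-activation idea concrete, here is a minimal sketch of a top-k MoE layer, showing how a router selects a small subset of expert MLPs per token so that only a fraction of the layer's parameters runs for each token. The layer sizes and expert counts below are illustrative assumptions, not Sarvam 105B's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Minimal top-k Mixture-of-Experts layer (illustrative sizes only;
    not Sarvam 105B's published configuration)."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        logits = self.router(x)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep k best experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = TopKMoELayer()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512]); only 2 of 8 expert MLPs ran per token
```

With top_k=2 of 8 experts, each token touches roughly a quarter of the expert parameters, which is the same mechanism that lets a 105B-parameter MoE activate only 10.3B parameters per token.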

Capabilities and Context

With a 128,000-token context window, Sarvam 105B can process long documents such as legal filings, technical manuals, and lengthy research papers in a single pass. It is particularly suited to enterprise applications such as automated code generation, complex mathematical reasoning, and multi-step agentic tasks. Benchmarks published by Sarvam AI indicate that the model is competitive with global frontier models in its size class, especially in reasoning and multilingual performance.
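
The long context window ties back to the YaRN scaling noted above: YaRN stretches a RoPE model's positions by interpolating the low-frequency rotary bands while leaving the high-frequency bands (which carry local detail) untouched, blending the two regimes with a linear ramp. The sketch below shows that frequency adjustment in simplified form; the hyperparameters are illustrative assumptions, not Sarvam's published settings, and YaRN's additional attention-temperature term is omitted.

```python
import numpy as np

def yarn_inv_freq(dim=128, base=10000.0, scale=8.0,
                  orig_ctx=16384, beta_fast=32, beta_slow=1):
    """Simplified YaRN frequency adjustment for RoPE (illustrative settings;
    e.g. scale=8 would stretch a 16K-trained model to ~128K positions)."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))

    # Dimension index at which a rotary band completes n_rot full rotations
    # over the original context length.
    def find_dim(n_rot):
        return dim * np.log(orig_ctx / (2 * np.pi * n_rot)) / (2 * np.log(base))

    low, high = find_dim(beta_fast), find_dim(beta_slow)
    ramp = np.clip((np.arange(dim // 2) - low) / max(high - low, 1e-3), 0, 1)

    # ramp == 0: high-frequency band, left unchanged (extrapolation);
    # ramp == 1: low-frequency band, slowed by `scale` (interpolation).
    return inv_freq * (1 - ramp + ramp / scale)

print(yarn_inv_freq()[:4])   # fastest bands: essentially unchanged
print(yarn_inv_freq()[-4:])  # slowest bands: divided by ~scale
```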

Open Source Availability

Sarvam 105B is released as an open-weights model under the Apache License 2.0, allowing developers and researchers to fine-tune or deploy it for specialized use cases. The release is part of a broader effort to democratize high-performance AI in India, providing a cost-effective alternative to proprietary APIs. The model was trained on a corpus of trillions of tokens spanning code, technical documentation, and regional-language data, and is compatible with mainstream inference frameworks, including vLLM and SGLang.
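
Since the page names vLLM as a supported framework, a minimal serving sketch follows. The repository ID and tensor-parallel setting are assumptions for illustration; check Sarvam AI's official release for the actual weights location and recommended deployment configuration.

```python
from vllm import LLM, SamplingParams

# Hypothetical repository ID; consult Sarvam AI's release page for the
# actual open-weights name. A 105B MoE will typically need multiple GPUs.
llm = LLM(model="sarvamai/sarvam-105b-reasoning", tensor_parallel_size=4)

params = SamplingParams(temperature=0.6, max_tokens=1024)
prompts = [
    "Summarise the key obligations in the following contract clause: ...",
    "Mujhe is equation ko step by step solve karke dikhao: 3x + 7 = 22",  # Hinglish prompt
]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```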
