India’s Sovereign AI Leap: Sarvam AI Open-Sources 30B and 105B Reasoning Models
Shreyas Karanjkar · Mar 9 · Updated: Mar 18

In a landmark moment for India’s technological autonomy, Bengaluru-based startup Sarvam AI officially announced the open-source release of its foundational large language models, Sarvam 30B and Sarvam 105B, on March 6, 2026.
Originally unveiled at the IndiaAI Impact Summit 2026, these models represent the first major "sovereign AI" initiative trained entirely within India on government-backed compute resources.
By releasing the model weights under the Apache 2.0 license, Sarvam AI is directly challenging the dominance of Western tech giants and providing Indian developers with a high-performance, culturally aligned alternative.
TL;DR:
- Open-Source Access: Both models are available for commercial use under the Apache 2.0 license, with weights hosted on Hugging Face and AIKosh (a minimal loading sketch follows this list).
- Advanced Architecture: A Mixture-of-Experts (MoE) design lets the 105B model activate only 10.3B parameters per token, keeping inference compute far below that of a dense model of the same size.
- Multilingual Mastery: Native support for 22 Indian languages, with a specialized tokenizer that cuts the "language tax" by making Indic text processing roughly 3x faster and cheaper.
- Specific Use Cases: The 30B model is optimized for real-time conversation (Samvaad), while the 105B flagship is built for complex agentic reasoning (Indus).
- Sovereign Infrastructure: Trained from scratch on 4,096 Nvidia H100 GPUs provided under the IndiaAI Mission in collaboration with Yotta Data Services.
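Because the weights ship under Apache 2.0 on Hugging Face, loading them should look like any standard transformers workflow. The sketch below is a hedged starting point: the repo ID sarvamai/sarvam-30b is a placeholder assumption, not a confirmed path, so substitute the name published on Sarvam's Hugging Face organization page.

```python
# Minimal loading sketch with Hugging Face transformers.
# NOTE: "sarvamai/sarvam-30b" is a placeholder repo ID (assumption),
# not a confirmed path -- substitute the published checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "sarvamai/sarvam-30b"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # shard across available GPUs
)

prompt = "भारत की राजधानी क्या है?"  # "What is the capital of India?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```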
A Full-Stack Effort by Sarvam AI for the Indian Context
The release of Sarvam 30B and 105B is not merely a software update; it is a full-stack engineering feat. Co-founders Pratyush Kumar and Vivek Raghavan emphasized that the models were trained from scratch using an in-house curated dataset of over 17 trillion tokens.
Crucially, nearly 20% of this data is sourced from Indian regional content, a dramatic jump from global models like GPT-4 or Llama, where Indic representation is often under 1%.
The Sarvam 30B variant is designed for high-speed, real-time applications. Despite its 30-billion-parameter total size, its MoE architecture activates only about 1 billion parameters per token, making it well suited to the high-volume, voice-first agents required in rural Indian markets.
Meanwhile, the Sarvam 105B model features a 128,000-token context window, allowing it to "reason" through massive legal documents, financial reports, and complex coding tasks with precision that rivals global state-of-the-art models.
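To make the MoE idea concrete, here is a minimal top-k routing sketch in PyTorch. The expert count and layer sizes are illustrative assumptions, not Sarvam's published configuration; the point is simply that only a small fraction of the layer's parameters runs for any given token, which is how a 105B-parameter model can serve requests at roughly the cost of a ~10B dense one.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy top-k Mixture-of-Experts layer. All sizes are illustrative
    assumptions, NOT Sarvam's published configuration."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.router(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Only the top-k experts run per token; the rest stay idle, so
        # active parameters per token << total parameters.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

layer = MoELayer()
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

Production MoE stacks add load-balancing losses and fused expert kernels, but the routing principle is the same.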
Breaking the "Language Tax"
One of the most significant breakthroughs is the model’s token efficiency. Traditionally, processing Indian languages has been 4 to 8 times more expensive than English due to inefficient tokenization in Western models.
Sarvam's new models slash this "language tax" by 70%, allowing businesses to deploy AI in Hindi, Tamil, or Marathi at close to the cost and speed of English. This makes the models particularly attractive for government service delivery and regional enterprise workflows.
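One way to see the language tax directly is to measure tokenizer fertility, i.e. tokens emitted per word, on the same Indic sentence: a 70% cut on a 4x-8x penalty lands at roughly 1.2x-2.4x, close to English parity. The sketch below uses Hugging Face transformers; the Sarvam repo ID is again a placeholder assumption.

```python
# Compare tokenizer "fertility" (tokens per word) on the same Hindi text.
# Higher fertility => more tokens => higher cost for identical content.
from transformers import AutoTokenizer

def fertility(repo_id: str, text: str) -> float:
    tok = AutoTokenizer.from_pretrained(repo_id)
    n_tokens = len(tok.encode(text, add_special_tokens=False))
    return n_tokens / len(text.split())

hindi = "भारत में कृत्रिम बुद्धिमत्ता का भविष्य उज्ज्वल है"

# "sarvamai/sarvam-30b" is a placeholder repo ID (assumption).
for repo in ["gpt2", "sarvamai/sarvam-30b"]:
    try:
        print(f"{repo}: {fertility(repo, hindi):.1f} tokens/word")
    except OSError:
        print(f"{repo}: repo not found -- substitute the published checkpoint")
```

A byte-level, English-centric tokenizer like GPT-2's shatters Devanagari into many byte tokens, while an Indic-aware vocabulary keeps fertility near what English enjoys.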
The Power of Public-Private Partnership
The development was fueled by the IndiaAI Mission, which provided a subsidy of nearly ₹100 crore for GPU access.
This public-private synergy has allowed Sarvam to build models that are not only "Made in India" but "Trained in India."
By keeping the training and data sovereign, Sarvam ensures that the "intelligence" of the model is rooted in Indian cultural nuances, legal frameworks, and social contexts rather than Western biases.
Our Thoughts
The open-sourcing of Sarvam 30B and 105B marks the beginning of a new era where India is no longer just a consumer of AI but a primary architect. As these models begin to power everything from rural kiosks to enterprise agentic workflows, Sarvam has set the stage for a localized AI revolution. With the foundation now open to the public, the next few months will likely see a surge in indigenous applications that bring the power of 100-billion-parameter reasoning to the "next billion" users.

