
Indian AI startup Sarvam AI has launched Sarvam Akshar, a document intelligence workbench designed to address the limitations of traditional OCR systems and modern multimodal models in handling complex, real-world documents. Built on the company’s Sarvam Vision model, Akshar is positioned as an intelligence layer that moves beyond basic text extraction to deliver grounded reasoning, visual understanding, and automated error correction.
The platform supports documents in English and 22 Indian languages, reflecting Sarvam’s focus on large-scale digitisation across diverse linguistic and historical datasets.
Sarvam highlighted that conventional OCR engines, including widely used open-source and enterprise solutions, rely on character-level recognition without understanding page structure. This often leads to broken reading order, poor layout interpretation, and unreliable outputs when processing newspapers, manuscripts, multi-column reports, and scanned archives.
These challenges become more pronounced for Indic scripts, where conjunct characters, diacritics, and dense formatting frequently result in misreadings. While newer vision-language models have improved multimodal understanding, Sarvam notes they still struggle with auditability, consistency, and complex layouts, which makes them unreliable for high-accuracy digitisation projects.
How Sarvam Akshar changes document extraction
Sarvam Akshar introduces layout-aware extraction by understanding semantic blocks such as headers, paragraphs, footnotes, and images rather than processing text line by line. The system uses visual grounding to pinpoint the exact location of each extracted element on a page.
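To make the idea concrete, the Python sketch below shows one way a layout-aware result could be represented: each semantic block carries its type, its text, and a bounding box that grounds it on the page. Sarvam has not published Akshar's data model in this article, so the `Block` class, its fields, and the `reading_order` helper are hypothetical illustrations, not the product's API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical semantic block; not Akshar's actual schema."""
    kind: str                                # e.g. "header", "paragraph", "footnote", "image"
    text: str
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page pixels
    page: int = 1


def reading_order(blocks: List[Block], page_width: float, columns: int = 2) -> List[Block]:
    """Order blocks page by page, column by column, then top to bottom.

    A deliberately naive heuristic: a block is assigned to the column
    that contains its left edge.
    """
    col_width = page_width / columns

    def key(block: Block):
        x0, y0, _, _ = block.bbox
        col = min(int(x0 // col_width), columns - 1)
        return (block.page, col, y0)

    return sorted(blocks, key=key)


blocks = [
    Block("paragraph", "right col", (520, 100, 980, 400)),
    Block("paragraph", "left col", (20, 100, 480, 400)),
    Block("header", "Title", (20, 20, 980, 80)),
]
ordered = reading_order(blocks, page_width=1000)
print([b.text for b in ordered])
```

A real system would need more than this heuristic, for example to place a full-width footnote after both columns, but the sketch shows why block-level structure and coordinates matter: a line-by-line reader would interleave the two columns.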
A key feature is its automated proofreading loop, where agents identify uncertainties and probable errors, enabling faster human verification. Sarvam says this approach allows experts to review hundreds of pages in the time typically required to manually transcribe a single document.
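As a rough illustration of how such a loop could speed up verification, the Python sketch below routes only low-confidence extractions to a human reviewer, least certain first. The article does not describe how Akshar's agents score uncertainty, so the `Extraction` class, the `confidence` field, and the 0.9 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Extraction:
    """Hypothetical extraction record; not Akshar's actual output format."""
    block_id: str
    text: str
    confidence: float  # assumed score between 0 and 1


def review_queue(items: List[Extraction], threshold: float = 0.9) -> List[Extraction]:
    """Return items below the threshold, least confident first."""
    flagged = [e for e in items if e.confidence < threshold]
    return sorted(flagged, key=lambda e: e.confidence)


items = [
    Extraction("p1-b1", "Clean printed heading", 0.99),
    Extraction("p1-b2", "Faded conjunct-heavy line", 0.62),
    Extraction("p1-b3", "Smudged footnote", 0.81),
]
queue = review_queue(items)
print([e.block_id for e in queue])
```

Under these assumptions, a reviewer checks two blocks instead of three; on an archive of thousands of pages, the same filtering is what would let experts skip the bulk of confidently extracted text.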
This makes the platform particularly useful for historical archives, legal records, academic repositories, and government digitisation efforts.
Focus on scale, accuracy, and Indian languages
By combining reasoning with structured layout understanding, Akshar aims to solve what Sarvam describes as the “last-mile” problem in document intelligence: converting raw scans into reliable, structured data ready for analysis and search.
The company claims its Sarvam Vision foundation model already delivers leading benchmark performance on both global datasets and Indic OCR tasks, giving Akshar a strong base for multilingual accuracy.
With Sarvam Akshar, the startup is positioning itself to support large institutions, publishers, and enterprises looking to modernise massive document collections while reducing manual correction costs.
Sarvam Akshar is now available as part of Sarvam’s AI platform, with access opened to developers and organisations seeking advanced document reasoning workflows. The launch reflects the growing demand for reliable AI systems capable of handling complex real-world data, particularly across India’s diverse language ecosystem.