Research that ships.
Our research is practical and deployment-oriented. Every research track at Medharvix is connected to a production system, a measurable benchmark, or a defined deployment target.
Low-Resource Machine Translation
In production
Fine-tuning multilingual foundation models (NLLB) for translation between English and underserved Indian languages. Our evaluation framework uses held-out test sets co-built with native speakers, measuring BLEU, chrF, and human evaluation metrics.
Khasi machine translation is our flagship research track, currently shipping at BLEU 48.0 through Bha-Kha V4 on Bhasaflow, a Medharvix Systems platform.
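To make the automatic metrics above more concrete, here is a minimal sketch of a chrF-style score: a character n-gram F-score between a system output and a reference. It is a simplified illustration, not our evaluation framework. The function names `char_ngrams` and `simple_chrf` are hypothetical, and the whitespace handling and averaging are assumptions, so its scores will not match an established scoring tool exactly.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace (a simplifying assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 100].

    Precision and recall are averaged over n-gram orders 1..max_order,
    then combined into an F-beta score (beta=2 weights recall more heavily).
    """
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100.0 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # Placeholder strings, not real test-set data.
    print(simple_chrf("the cat sat on the mat", "the cat sat on a mat"))
```

Character-level scoring of this kind is often preferred alongside BLEU for languages with rich morphology or inconsistent spelling, because a partially correct word still earns credit.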
Evaluation Infrastructure
Active
Building structured evaluation pipelines for low-resource language AI. We treat evaluation as a first-class research problem: curating benchmarks, designing human evaluation protocols, and creating test sets that reflect real-world language use rather than synthetic data.
Speech Systems for Indian Languages
Under research
Developing text-to-speech and automatic speech recognition systems for languages with limited existing coverage. Our speech work addresses acoustic modelling, phonological adaptation, and prosody for languages like Khasi where standard speech models underperform.
Khasi TTS is in active research. Khasi ASR is under development.
Document Intelligence and OCR
Under research
Research on reading and understanding printed and handwritten text in Indian scripts. This work supports digitisation of institutional records, preservation of historical documents, and extraction of structured information from unstructured visual inputs.
Multilingual Pipeline Architecture
Active
Designing modular AI pipelines that combine translation, speech, and document understanding into integrated systems. This research underpins the Bhasaflow platform architecture and enables us to add new languages and modalities without rebuilding from scratch.
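The modular idea can be illustrated with a small sketch in which each capability is a pluggable stage that reads and extends a shared payload. This is a hypothetical illustration, not the Bhasaflow architecture: the `Pipeline` class, the stub stages, and the payload keys are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A stage takes the shared payload and returns an updated payload.
Stage = Callable[[dict], dict]


@dataclass
class Pipeline:
    """Runs named stages in order over a shared dictionary payload."""
    stages: List[Tuple[str, Stage]] = field(default_factory=list)

    def add(self, name: str, stage: Stage) -> "Pipeline":
        self.stages.append((name, stage))
        return self  # allow chaining

    def run(self, payload: dict) -> dict:
        for _name, stage in self.stages:
            payload = stage(dict(payload))  # copy so stages do not mutate caller data
        return payload


def stub_ocr(payload: dict) -> dict:
    # Placeholder: a real stage would run a document-reading model on the scan.
    payload["text"] = payload["scan"].strip()
    return payload


def stub_translate(payload: dict) -> dict:
    # Placeholder: a real stage would call a translation model here.
    payload["translation"] = f"[{payload['target_lang']}] {payload['text']}"
    return payload


pipeline = Pipeline().add("ocr", stub_ocr).add("translate", stub_translate)
result = pipeline.run({"scan": "  hello world ", "target_lang": "kha"})
print(result["translation"])
```

Because stages share only the payload contract, adding a new language or modality in this sketch means registering another stage rather than changing the existing ones.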
