Degraded Ancient Ashokan Brahmi Script Recognition via Self-Supervised Pretraining and WGAN-GP Degradation Pipelines

Project Abstract & Overview

Deciphering highly degraded ancient scripts is bottlenecked by a lack of clean, annotated data. As Computer Vision and Machine Learning Lead, I direct a research team building an end-to-end optical character recognition (OCR) and document analysis pipeline for ancient Ashokan Brahmi script. We engineered a data generation pipeline that uses WGAN-GP to synthesize 20K+ unique character forms, then applies physically motivated degradation modeling to those forms to build a 150K-sequence training dataset.

Key Methodologies & Contributions