Hindi font samples

For the DARPA TIDES "surprise language" exercise I pulled up from tertiary storage full-page images from the font sample booklet of some Hindi printing press. While there is no direct OCR component to the exercise (and why not, Charles?), those shops that actually use OCR to enhance their data collection efforts may still find this useful for calibrating their software. Here is the gzipped tar file; a sample page (page 2 of 48 TIFF images at 300 dpi) is included below. If you perform any work on these pages that would enhance the dataset (rotating the images the right way, segmenting out the various samples, ground-truthing the data, etc.), I would appreciate it if you'd send me a copy so that I could post it here. Your contribution will be fully acknowledged.

Andras Kornai
June 5 2003

Ground truth (for testing with Tesseract) for the Marathi pages is now kindly provided by ShreeDevi Kumar on GitHub.

February 23 2016