Optical character recognition is the process of conversion of images of text into machine-encoded text electronically or mechanically. The text on image can be handwritten, typed or printed. Some of the examples of image source can be a picture of a document, a scanned document or a
text which is superimposed on an image. Most optical character recognition system does not give a 100% accurate result. This project aims at analyzing the error rate of a few open source optical character recognition systems (Boxoft OCR, ABBY, Tesseract, Free Online OCR etc.) on a set of diverse
documents and makes a comparative study of the same. By this, we can study which OCR is the best suited for a document.