Monday, January 19, 2015

Optical character recognition (OCR)

Systems for recognizing machine-printed text originated in the late 1950s and have been in widespread use on desktop computers since the early 1990s. It is still an active area of research because the problem is inherently complex.

OCR is a field that draws on pattern recognition, image processing, and natural language processing. The technology has advanced to the point where today’s systems are genuinely useful for processing a wide variety of machine-printed documents. Accuracies of 99% or more are routinely achieved on cleanly printed pages.
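To make the 99% figure concrete, here is a minimal sketch of one common way to score OCR output: character accuracy based on edit distance against a known-correct transcription. The function names `edit_distance` and `character_accuracy` are illustrative, and this is only one possible definition of accuracy, not necessarily the one behind the figure quoted above.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the OCR output
                current[j - 1] + 1,      # extra character in the OCR output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Share of reference characters recognized correctly (assumed definition)."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    truth = "Optical character recognition"
    ocr_output = "0ptical character recogniti0n"  # two letters misread as zeros
    print(f"{character_accuracy(truth, ocr_output):.3f}")
```

Under this definition, 99% accuracy on a hypothetical page of 2,000 characters still leaves about 20 errors to correct by hand.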

Optical character recognition (OCR) is the process of scanning printed pages as images, typically on a flatbed scanner, and then using OCR software to recognize the letters and output them as ASCII text. The technology reads typewritten, computer-printed, or hand-printed characters from ordinary documents and translates the images into a form that the computer can process.

OCR software provides tools both for acquiring the image from a scanner and for recognizing the text in it.
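The recognition step can be illustrated with a toy example. The sketch below assumes the page has already been scanned and cut into small black-and-white glyph bitmaps, and it classifies each glyph by comparing it with stored templates. The three-letter `TEMPLATES` table and all function names are made up for illustration; real OCR software uses far more robust methods than this simple template matching.

```python
# Hypothetical 3x5 templates: '#' is an ink pixel, '.' is background.
TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def pixel_mismatches(glyph, template):
    """Count the pixels where the scanned glyph differs from a template."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize_glyph(glyph):
    """Return the character whose template differs from the glyph the least."""
    return min(TEMPLATES, key=lambda ch: pixel_mismatches(glyph, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Translate a sequence of glyph bitmaps into a plain text string."""
    return "".join(recognize_glyph(glyph) for glyph in glyphs)


if __name__ == "__main__":
    # A 'T' with one ink pixel lost in scanning, followed by clean 'I' and 'L'.
    noisy_t = ("###",
               ".#.",
               ".#.",
               "...",
               ".#.")
    print(recognize_line([noisy_t, TEMPLATES["I"], TEMPLATES["L"]]))
```

Because the closest template wins, a glyph with a few damaged pixels can still be read correctly, which hints at why clean originals give better results than poor copies.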

OCR works best with originals or very clear copies and with mono-spaced fonts such as Courier. For good results, the text should be set at 12 points or larger.

With continued advances in microcomputer technology, further improvements in OCR can be expedited. OCR machines that use dedicated microprocessors will be able to achieve greater speed and will therefore serve traditional OCR applications more effectively.
