Global Binarization of Document Images Using a Neural Network

Author(s):  
Adnan Khashman ◽  
Boran Sekeroglu
Author(s):  
Gulfeshan Parween

Abstract: In this paper, we present a scheme for developing a complete OCR system for printed uppercase English letters in different fonts and sizes, so that the system can be used in the banking, corporate, and legal industries, among others. An OCR system consists of modules such as preprocessing, segmentation, feature extraction, and recognition. The preprocessing step is expected to include gray-level conversion, binary conversion, etc. After the features of the segmented characters have been extracted, an artificial neural network can be used for character recognition. Efforts have been made to improve the performance of character recognition using artificial neural network techniques. The proposed OCR system accepts printed document images from a file and is implemented in MATLAB R2014a. Key words: OCR, Printed text, Barcode recognition
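
The preprocessing steps named in the abstract (gray-level conversion followed by binary conversion) can be illustrated with a minimal sketch. The abstract does not say which conversion weights or thresholding rule the authors use, and their implementation is in MATLAB, so everything below is an assumption made for illustration: the ITU-R BT.601 luma weights, Otsu's global threshold, and the function names `to_gray`, `otsu_threshold`, and `binarize` are all hypothetical choices, not the paper's method.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to 0-255 gray levels.

    Assumption: ITU-R BT.601 luma weights; the paper does not specify any.
    """
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def otsu_threshold(gray):
    """Return a global threshold t maximizing the between-class variance.

    Pixels with value <= t form the dark class, pixels > t the light class.
    Assumption: Otsu's method stands in for the unspecified binary conversion.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1

    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # all pixels are already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray, t):
    """Map each pixel to 0 (dark, e.g. ink) or 1 (light, e.g. paper)."""
    return [[1 if v > t else 0 for v in row] for row in gray]


# Usage on a tiny synthetic image: dark "ink" pixels on a light background.
image = [[(10, 10, 10), (240, 240, 240)],
         [(240, 240, 240), (10, 10, 10)]]
gray = to_gray(image)
binary = binarize(gray, otsu_threshold(gray))
```

A single global threshold of this kind suits cleanly printed pages; unevenly lit or stained documents usually call for a local or learned method instead.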


Author(s):  
Jia-Cheng Tu ◽  
Guo-Shiang Lin ◽  
Chao-Chuan Chang ◽  
Kuan-Cheng Huang ◽  
Ming-Hsien Tasi ◽  
...  

Author(s):  
María José Castro-Bleda ◽  
Salvador España-Boquera ◽  
Francisco Zamora-Martínez

The field of off-line optical character recognition (OCR) has been a topic of intensive research for many years (Bozinovic, 1989; Bunke, 2003; Plamondon, 2000; Toselli, 2004). One of the first steps in the classical architecture of a text recognizer is preprocessing, where noise reduction and normalization take place. Many systems do not require a binarization step, so the images are maintained in gray-level quality. Document enhancement not only influences the overall performance of OCR systems, but it can also significantly improve document readability for human readers. In many cases, the noise of document images is heterogeneous, and a technique fitted for one type of noise may not be valid for the overall set of documents. One possible solution to this problem is to use several filters or techniques and to provide a classifier to select the appropriate one. Neural networks have been used for document enhancement (see (Egmont-Petersen, 2002) for a review of image processing with neural networks). One advantage of neural network filters for image enhancement and denoising is that a different neural filter can be automatically trained for each type of noise. This work proposes the clustering of neural network filters to avoid having to label training data and to reduce the number of filters needed by the enhancement system. An agglomerative hierarchical clustering algorithm of supervised classifiers is proposed to do this. The technique has been applied to filter out the background noise from an office (coffee stains and footprints on documents, folded sheets with degraded printed text, etc.).

