Niblack Binarization on Document Images: Area Efficient, Low Cost, and Noise Tolerant Stochastic Architecture

Author(s):  
Shyamali Mitra ◽  
K. C. Santosh ◽  
Mrinal Kanti Naskar

Binarization plays a crucial role in Optical Character Recognition (OCR) and its ancillary domains, such as the recovery of degraded document images. In Document Image Analysis (DIA), selecting a threshold is not trivial, since it differs from one problem (dataset) to another. Instead of trying several different thresholds from one dataset to the next, our proposed binarization scheme accounts for the noise inherent in document images. The proposed stochastic architecture implements a local thresholding technique, Niblack’s binarization algorithm. We introduce a stochastic comparator circuit that works on unipolar stochastic numbers. Unlike conventional stochastic circuits, it is simple and easy to deploy. We implemented it on the Xilinx Virtex6 XC6VLX760-2FF1760 FPGA platform and obtained encouraging experimental results. The complete set of results is available upon request. Moreover, compared to conventional designs, the proposed stochastic implementation is better in terms of time complexity as well as fault tolerance.
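
As a point of reference, Niblack's rule thresholds each pixel at T(x, y) = m(x, y) + k * s(x, y), where m and s are the mean and standard deviation over a local window. A minimal software sketch of that rule (not the stochastic hardware described in the abstract) might look as follows, with the window size and k = -0.2 as illustrative choices:

# Software reference for Niblack's local threshold T = m + k*s over a w x w window.
# This sketches only the thresholding rule, not the stochastic comparator circuit.
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(gray, w=25, k=-0.2):
    img = gray.astype(np.float64)
    mean = uniform_filter(img, size=w)                 # local mean m(x, y)
    sq_mean = uniform_filter(img * img, size=w)        # local mean of squares
    std = np.sqrt(np.maximum(sq_mean - mean**2, 0.0))  # local std s(x, y)
    threshold = mean + k * std                         # Niblack rule
    return (img > threshold).astype(np.uint8) * 255    # white background, black ink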

2019 ◽  
Vol 9 (21) ◽  
pp. 4529
Author(s):  
Tao Liu ◽  
Hao Liu ◽  
Yingying Wu ◽  
Bo Yin ◽  
Zhiqiang Wei

Capturing document images with digital cameras under uneven lighting is challenging and often yields poorly captured images, which hinders subsequent processing such as Optical Character Recognition (OCR). In this paper, we propose the use of exposure bracketing techniques to solve this problem. Instead of capturing one image, we capture several images with different exposure settings and use exposure bracketing to generate a high-quality image that incorporates useful information from each of them. We found that this technique can enhance image quality and provides an effective way of improving OCR accuracy. Our contributions in this paper are two-fold: (1) a preprocessing chain that uses exposure bracketing techniques for document images is discussed, and an automatic registration method is proposed to find the geometric disparity between multiple document images, which lays the foundation for exposure bracketing; (2) several representative exposure bracketing algorithms are incorporated into the processing chain, and their performances are evaluated and compared.
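
As a rough illustration of the pipeline described above, the sketch below aligns several exposures of a page and fuses them with OpenCV; the ECC alignment and Mertens fusion used here are common stand-ins, not necessarily the registration method or bracketing algorithms evaluated in the paper:

# Align several exposures of the same page, then fuse them into one image.
import cv2
import numpy as np

def fuse_exposures(paths):
    images = [cv2.imread(p) for p in paths]
    ref_gray = cv2.cvtColor(images[0], cv2.COLOR_BGR2GRAY)
    aligned = [images[0]]
    for img in images[1:]:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        warp = np.eye(2, 3, dtype=np.float32)
        # estimate the geometric disparity between shots (ECC criterion, affine model)
        _, warp = cv2.findTransformECC(ref_gray, gray, warp, cv2.MOTION_AFFINE)
        h, w = ref_gray.shape
        aligned.append(cv2.warpAffine(img, warp, (w, h),
                                      flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP))
    # Mertens exposure fusion: no camera response curve is needed
    fused = cv2.createMergeMertens().process(aligned)
    return np.clip(fused * 255, 0, 255).astype(np.uint8)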


Author(s):  
Priti P. Rege ◽  
Shaheera Akhter

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task, and it is necessary for improving the accuracy of an OCR system. Traditionally, separating text from a document has relied on feature extraction processes that require handcrafting of the features. Deep learning-based methods, by contrast, are excellent feature extractors that learn features from the training data automatically, and they give state-of-the-art results on various computer vision tasks such as image classification, segmentation, image captioning, object detection, and recognition. This chapter compares traditional as well as deep learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned with transfer learning to obtain more precise results. The final results show that deep learning methods give more accurate results than conventional image processing methods for Devanagari text extraction.
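
For orientation, a compact U-Net-style encoder-decoder for pixel-wise text/background labelling could be sketched in PyTorch as below; the depth, channel widths, and binary output head are illustrative, not the exact configuration the authors use:

# Minimal U-Net-style network: grayscale page in, text-probability map out.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)            # one channel: text probability

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.head(d1))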


2005 ◽  
Vol 14 (01) ◽  
pp. 109-122 ◽  
Author(s):  
Y. ALGINAHI ◽  
D. FEKRI ◽  
M. A. SID-AHMED

Page segmentation is necessary for optical character recognition and very useful in document image manipulation. This paper describes two classification methods, a modified linear adaptive method and a proposed neural network system, that classify an image into text, halftone images (photos, dark images, etc.), and graphics (graphs, tables, flowcharts, etc.). The blocks were segmented using the Run Length Smearing Algorithm, with the smearing performed automatically by fixing the smearing threshold values. Features are extracted from the segmented blocks for classification into text, graphics, and halftone images. The second method uses a multi-layer perceptron neural network for classification: two parameters, a shape factor f1 and an angle derived from the rectangular block segments, are fed into the network, which outputs the three classes text, halftone images, and graphics. Experiments on 30 mixed-content document images show that the method works well on a wide variety of layouts in document images.
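
The Run Length Smearing Algorithm mentioned above flips runs of background pixels shorter than a threshold to foreground, so that nearby characters merge into solid blocks. A minimal sketch, with illustrative thresholds rather than the fixed values used in the paper:

# RLSA on a binary image: 1 = foreground (ink), 0 = background.
import numpy as np

def rlsa_horizontal(binary, max_gap=30):
    out = binary.copy()
    for row in out:
        run_start, in_gap = 0, False
        for x, v in enumerate(row):
            if v == 0 and not in_gap:
                run_start, in_gap = x, True
            elif v == 1 and in_gap:
                if x - run_start <= max_gap:     # short gap: smear it
                    row[run_start:x] = 1
                in_gap = False
    return out

def rlsa(binary, h_gap=30, v_gap=15):
    horiz = rlsa_horizontal(binary, h_gap)
    vert = rlsa_horizontal(binary.T, v_gap).T    # vertical pass via transpose
    return horiz & vert                          # AND of the two smeared images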


Author(s):  
Neha. N

Document image processing is an increasingly important technology, essential in all optical character recognition (OCR) systems and for the automation of various office documents. A document originally has zero skew (tilt), but when a page is scanned or photocopied, skew may be introduced by various factors and is practically unavoidable. The presence of even a small amount of skew (0.5°) has detrimental effects on document analysis, as it directly affects the reliability and efficiency of the segmentation, recognition, and feature extraction stages. Therefore, skew removal is of paramount importance in the field of document analysis and OCR, and it is the first step to be accomplished. This paper presents a novel technique for skew detection and correction that is both language and content independent. The proposed technique is based on the maximum density of the foreground pixels and their orientation in the document image. Unlike other conventional algorithms, which work only for machine-printed textual documents scripted in English, this technique works well for all kinds of document images (machine printed, handwritten, complex, noisy, and simple). The technique presented here is tested with 150 different document image samples and is found to provide results with an accuracy of 0.1°.
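
As general background, a common projection-profile baseline for skew estimation is sketched below; it is not the foreground-density technique proposed in the paper, only an illustration of the task of finding the angle at which text lines align:

# Rotate over candidate angles; the angle maximizing row-profile variance is the skew.
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary, max_angle=15.0, step=0.1):
    """binary: 2-D array, 1 = ink, 0 = background."""
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step, step):
        rotated = rotate(binary, angle, reshape=False, order=0)
        profile = rotated.sum(axis=1)            # horizontal projection profile
        score = profile.var()                    # aligned text lines give a peaky profile
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle                            # correct by rotating back by -best_angle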


Author(s):  
V. J. Rehna ◽  
Abid Siddique ◽  
Sreenivas Naik

Aims: To introduce a cost-effective tool for reading and interpreting machine-printed text in document images and saving it as computer-processable codes. Study Design: In this work, emphasis is given to extracting uppercase and lowercase letters and numerals from document images through segmentation and feature extraction using the MATLAB Image Processing Toolbox. Place and Duration of Study: Department of Engineering, Ibri College of Technology, between September 2017 and May 2018. Methodology: Necessary information about existing character recognition algorithms was collected by reviewing the relevant literature available in journals, books, manuals, and related documents. A suitable architecture and a novel algorithm for a simple, low-cost, low-complexity, highly accurate system were developed as per the specifications and the reviewed literature. The functionality of the design was verified using MATLAB simulation. Results: The proposed method can extract characters from a document image (scanned or camera captured) of any font size, colour, or spacing, and the text can be rewritten in an editable window such as Notepad or WordPad, where the characters can even be edited; this improves accuracy and saves time. Conclusion: The algorithm gives promising results on a number of images, in which almost all characters are retrieved, and achieves 90 percent accuracy for all printed characters.
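
As a loose Python/OpenCV counterpart of the segmentation step described above (the paper itself works in MATLAB), one might binarize the page, take connected components, and crop candidate character boxes; the size filter and reading-order heuristic here are illustrative guesses:

# Binarize, label connected components, and crop candidate character boxes.
import cv2

def extract_character_boxes(path, min_area=20):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu threshold, inverted so ink becomes foreground
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = []
    for i in range(1, n):                        # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= min_area:
            boxes.append((x, y, w, h))
    # sort roughly in reading order: top-to-bottom bands, then left-to-right
    boxes.sort(key=lambda b: (b[1] // 50, b[0]))
    return [gray[y:y + h, x:x + w] for (x, y, w, h) in boxes]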


2017 ◽  
Vol 8 (1) ◽  
pp. 61-76 ◽  
Author(s):  
Aicha Eutamene ◽  
Mohamed Khireddine Kholladi ◽  
Djamel Gaceb ◽  
Hacene Belhadef

In the past two decades, solving complex search and optimization problems with bio-inspired metaheuristic algorithms has received considerable attention among researchers. In this paper, document image preprocessing is treated as an optimization problem, and the PSO (Particle Swarm Optimization) algorithm is chosen to solve it in order to select the best parameters. The preprocessing step is the basis of all other steps in an OCR (Optical Character Recognition) system, such as binarization, segmentation, skew correction, layout extraction, textual zone detection, and recognition itself. Without preprocessing, the presence of degradation in the image significantly reduces the performance of these steps. The authors' contribution focuses on preprocessing of the smoothing and filtering type, using a new Adaptive Mean Shift algorithm based on the integral image. Local adaptation to the image quality accelerates conventional smoothing by avoiding the processing of homogeneous zones. The authors' goal is to show how the PSO algorithm can improve result quality and the choice of parameters in preprocessing methods for document images. Comparative studies as well as tests on an existing dataset are reported to confirm the efficiency of the proposed approach.
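
For readers unfamiliar with PSO, a bare-bones swarm loop in the spirit described above is sketched below: each particle is a candidate vector of preprocessing parameters, and the objective scores the preprocessed image. The objective, bounds, and hyperparameters are placeholders, and preprocess_quality is a hypothetical scoring function, not part of the paper:

# Standard global-best PSO minimizing an arbitrary objective over box bounds.
import numpy as np

def pso(objective, bounds, n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5):
    lo, hi = np.array(bounds).T                         # bounds: [(lo, hi), ...]
    dim = len(bounds)
    pos = np.random.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = np.random.rand(n_particles, dim), np.random.rand(n_particles, dim)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# e.g. tune a smoothing bandwidth and a contrast parameter (hypothetical objective):
# best = pso(lambda p: preprocess_quality(image, *p), bounds=[(1, 15), (0.1, 2.0)])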


2015 ◽  
Vol 15 (01) ◽  
pp. 1550002
Author(s):  
Brij Mohan Singh ◽  
Rahul Sharma ◽  
Debashis Ghosh ◽  
Ankush Mittal

Many documents, such as maps, engineering drawings, and artistic documents, contain both printed and handwritten material in which text regions and text-lines are not parallel to each other, are curved in nature, and hold various types of text with different font sizes, text and non-text areas lying close to each other, and non-straight, skewed, and warped text-lines. Commercially available optical character recognition (OCR) systems such as ABBYY FineReader and Free OCR are not capable of handling such a range of stylistic document images containing curved, multi-oriented, and stylish font text-lines. Extraction of individual text-lines and words from these documents is generally not straightforward. Most of the segmentation work reported is on simple documents, and it remains a highly challenging task to implement an OCR that works under all possible conditions and gives highly accurate results, especially in the case of stylistic documents. This paper presents an approach based on dilation and flood-fill morphological operations that extracts multi-oriented text-lines and, in subsequent stages, words from complex-layout or stylistic document images. The segmentation results obtained from our method prove to be superior to those of the standard profiling-based method.
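
A simplified sketch of the morphology-driven idea follows: dilate the ink so that characters within a line merge, then take each connected component as one text-line mask. The kernel size is an illustrative choice, and the paper couples dilation with flood filling and further word-level steps not shown here:

# Dilation-based text-line grouping with OpenCV.
import cv2
import numpy as np

def extract_text_line_masks(gray, kernel_size=(25, 5)):
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernel_size)
    dilated = cv2.dilate(binary, kernel, iterations=1)   # merge characters into lines
    n, labels = cv2.connectedComponents(dilated, connectivity=8)
    masks = []
    for i in range(1, n):                                # label 0 is the background
        component = (labels == i).astype(np.uint8)
        masks.append(cv2.bitwise_and(binary, binary, mask=component))
    return masks                                         # one ink mask per text-line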


2015 ◽  
Vol 4 (2) ◽  
pp. 74-94
Author(s):  
Pawan Kumar Singh ◽  
Ram Sarkar ◽  
Mita Nasipuri

Script identification has been an appealing research interest in the field of document image analysis over the last few decades. Accurate recognition of the script is paramount to many post-processing steps such as automated document sorting, machine translation, and searching for text written in a particular script in a multilingual environment. For automatic processing of such documents through Optical Character Recognition (OCR) software, it is necessary to identify the script of the different words in the documents before feeding them to the OCR engines of the individual scripts. In this paper, a robust word-level handwritten script identification technique is proposed that uses texture-based features to identify words written in any of seven popular scripts, namely Bangla, Devanagari, Gurumukhi, Malayalam, Oriya, Telugu, and Roman. The texture-based features comprise a combination of Histograms of Oriented Gradients (HOG) and moment invariants. The technique has been tested on 7000 handwritten text words, with each script contributing 1000 words. Based on the identification accuracies and statistical significance testing of seven well-known classifiers, the Multi-Layer Perceptron (MLP) was chosen as the final classifier and then tested comprehensively using different folds and different epoch sizes. The overall accuracy of the system is found to be 94.7% using a 5-fold cross-validation scheme, which is quite impressive considering the complexities and shape variations of the said scripts. This is an extended version of the paper described in (Singh et al., 2014).
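
As a small sketch of the feature/classifier pairing described above, HOG descriptors of word images can be fed to a Multi-Layer Perceptron; moment invariants could be appended to the feature vector in the same way. The image size, HOG parameters, and MLP layout below are illustrative, not the paper's settings:

# HOG word descriptors plus an MLP classifier for script labels.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.neural_network import MLPClassifier

def word_features(word_img):
    patch = resize(word_img, (64, 128), anti_aliasing=True)   # normalize word size
    return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_script_classifier(word_images, script_labels):
    X = np.array([word_features(img) for img in word_images])
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
    clf.fit(X, script_labels)
    return clf   # clf.predict(word_features(new_word)[None, :]) gives the script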

