Niblack Binarization on Document Images: Area Efficient, Low Cost, and Noise Tolerant Stochastic Architecture

Author(s):  
Shyamali Mitra ◽  
K. C. Santosh ◽  
Mrinal Kanti Naskar

Binarization plays a crucial role in Optical Character Recognition (OCR) and its ancillary domains, such as the recovery of degraded document images. In Document Image Analysis (DIA), selecting a threshold is not trivial, since it differs from one problem (dataset) to another. Instead of trying several different thresholds from one dataset to the next, our proposed binarization scheme accounts for the noise inherent in document images. The proposed stochastic architecture implements a local thresholding technique, Niblack’s binarization algorithm. We introduce a stochastic comparator circuit that works on unipolar stochastic numbers. Unlike conventional stochastic circuits, it is simple and easy to deploy. We implemented it on the Xilinx Virtex6 XC6VLX760-2FF1760 FPGA platform and obtained encouraging experimental results. The complete set of results is available upon request. Moreover, compared to conventional designs, the proposed stochastic implementation is better in terms of time complexity as well as fault tolerance.
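
As a point of reference, Niblack's rule thresholds each pixel at T(x, y) = m(x, y) + k * s(x, y), where m and s are the mean and standard deviation over a local window. A minimal software sketch of that rule (not the stochastic hardware described in the abstract) might look as follows, with the window size and k = -0.2 as illustrative choices:

# Software reference for Niblack's local threshold T = m + k*s over a w x w window.
# This sketches only the thresholding rule, not the stochastic comparator circuit.
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(gray, w=25, k=-0.2):
    img = gray.astype(np.float64)
    mean = uniform_filter(img, size=w)                 # local mean m(x, y)
    sq_mean = uniform_filter(img * img, size=w)        # local mean of squares
    std = np.sqrt(np.maximum(sq_mean - mean**2, 0.0))  # local std s(x, y)
    threshold = mean + k * std                         # Niblack rule
    return (img > threshold).astype(np.uint8) * 255    # white background, black ink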

2019 ◽  
Vol 9 (21) ◽  
pp. 4529
Author(s):  
Tao Liu ◽  
Hao Liu ◽  
Yingying Wu ◽  
Bo Yin ◽  
Zhiqiang Wei

Capturing document images with digital cameras under uneven lighting is challenging and often yields poorly captured images, which hinders subsequent processing such as Optical Character Recognition (OCR). In this paper, we propose the use of exposure bracketing techniques to solve this problem. Instead of capturing one image, we capture several images with different exposure settings and use exposure bracketing to generate a high-quality image that incorporates useful information from each of them. We found that this technique can enhance image quality and provides an effective way of improving OCR accuracy. Our contributions in this paper are two-fold: (1) a preprocessing chain that uses exposure bracketing techniques for document images is discussed, and an automatic registration method is proposed to find the geometric disparity between multiple document images, which lays the foundation for exposure bracketing; (2) several representative exposure bracketing algorithms are incorporated into the processing chain, and their performances are evaluated and compared.
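
As a rough illustration of the pipeline described above, the sketch below aligns several exposures of a page and fuses them with OpenCV; the ECC alignment and Mertens fusion used here are common stand-ins, not necessarily the registration method or bracketing algorithms evaluated in the paper:

# Align several exposures of the same page, then fuse them into one image.
import cv2
import numpy as np

def fuse_exposures(paths):
    images = [cv2.imread(p) for p in paths]
    ref_gray = cv2.cvtColor(images[0], cv2.COLOR_BGR2GRAY)
    aligned = [images[0]]
    for img in images[1:]:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        warp = np.eye(2, 3, dtype=np.float32)
        # estimate the geometric disparity between shots (ECC criterion, affine model)
        _, warp = cv2.findTransformECC(ref_gray, gray, warp, cv2.MOTION_AFFINE)
        h, w = ref_gray.shape
        aligned.append(cv2.warpAffine(img, warp, (w, h),
                                      flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP))
    # Mertens exposure fusion: no camera response curve is needed
    fused = cv2.createMergeMertens().process(aligned)
    return np.clip(fused * 255, 0, 255).astype(np.uint8)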


Author(s):  
Priti P. Rege ◽  
Shaheera Akhter

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task, and it is necessary for improving the accuracy of an OCR system. Traditionally, separating text from a document has relied on feature extraction processes that require handcrafting of the features. Deep learning-based methods, by contrast, are excellent feature extractors that learn features from the training data automatically, and they give state-of-the-art results on various computer vision tasks such as image classification, segmentation, image captioning, object detection, and recognition. This chapter compares traditional as well as deep learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned with transfer learning to obtain more precise results. The final results show that deep learning methods give more accurate results than conventional image processing methods for Devanagari text extraction.
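
For orientation, a compact U-Net-style encoder-decoder for pixel-wise text/background labelling could be sketched in PyTorch as below; the depth, channel widths, and binary output head are illustrative, not the exact configuration the authors use:

# Minimal U-Net-style network: grayscale page in, text-probability map out.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)            # one channel: text probability

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.head(d1))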


2005 ◽  
Vol 14 (01) ◽  
pp. 109-122 ◽  
Author(s):  
Y. ALGINAHI ◽  
D. FEKRI ◽  
M. A. SID-AHMED

Page segmentation is necessary for optical character recognition and very useful in document image manipulation. This paper describes two classification methods, a modified linear adaptive method and a proposed neural network system, that classify an image into text, halftone images (photos, dark images, etc.), and graphics (graphs, tables, flowcharts, etc.). The blocks were segmented using the Run Length Smearing Algorithm, with the smearing performed automatically by fixing the smearing threshold values. Features are extracted from the segmented blocks for classification into text, graphics, and halftone images. The second method uses a multi-layer perceptron neural network for classification: two parameters, a shape factor f1 and an angle derived from the rectangular block segments, are fed into the network, which outputs the three classes text, halftone images, and graphics. Experiments on 30 mixed-content document images show that the method works well on a wide variety of layouts in document images.
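
The Run Length Smearing Algorithm mentioned above flips runs of background pixels shorter than a threshold to foreground, so that nearby characters merge into solid blocks. A minimal sketch, with illustrative thresholds rather than the fixed values used in the paper:

# RLSA on a binary image: 1 = foreground (ink), 0 = background.
import numpy as np

def rlsa_horizontal(binary, max_gap=30):
    out = binary.copy()
    for row in out:
        run_start, in_gap = 0, False
        for x, v in enumerate(row):
            if v == 0 and not in_gap:
                run_start, in_gap = x, True
            elif v == 1 and in_gap:
                if x - run_start <= max_gap:     # short gap: smear it
                    row[run_start:x] = 1
                in_gap = False
    return out

def rlsa(binary, h_gap=30, v_gap=15):
    horiz = rlsa_horizontal(binary, h_gap)
    vert = rlsa_horizontal(binary.T, v_gap).T    # vertical pass via transpose
    return horiz & vert                          # AND of the two smeared images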


Author(s):  
Neha. N

Document image processing is an increasingly important technology, essential in all optical character recognition (OCR) systems and for the automation of various office documents. A document originally has zero skew (tilt), but when a page is scanned or photocopied, skew may be introduced by various factors and is practically unavoidable. The presence of even a small amount of skew (0.5°) has detrimental effects on document analysis, as it directly affects the reliability and efficiency of the segmentation, recognition, and feature extraction stages. Therefore, skew removal is of paramount importance in the field of document analysis and OCR, and it is the first step to be accomplished. This paper presents a novel technique for skew detection and correction that is both language and content independent. The proposed technique is based on the maximum density of the foreground pixels and their orientation in the document image. Unlike other conventional algorithms, which work only for machine-printed textual documents scripted in English, this technique works well for all kinds of document images (machine printed, handwritten, complex, noisy, and simple). The technique presented here is tested with 150 different document image samples and is found to provide results with an accuracy of 0.1°.
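
As general background, a common projection-profile baseline for skew estimation is sketched below; it is not the foreground-density technique proposed in the paper, only an illustration of the task of finding the angle at which text lines align:

# Rotate over candidate angles; the angle maximizing row-profile variance is the skew.
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary, max_angle=15.0, step=0.1):
    """binary: 2-D array, 1 = ink, 0 = background."""
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step, step):
        rotated = rotate(binary, angle, reshape=False, order=0)
        profile = rotated.sum(axis=1)            # horizontal projection profile
        score = profile.var()                    # aligned text lines give a peaky profile
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle                            # correct by rotating back by -best_angle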


Author(s):  
V. J. Rehna ◽  
Abid Siddique ◽  
Sreenivas Naik

Aims: To introduce a cost-effective tool for reading and interpreting machine-printed text in document images and saving it as computer-processable codes. Study Design: In this work, emphasis is given to extracting uppercase and lowercase letters and numerals from document images through segmentation and feature extraction using the MATLAB Image Processing Toolbox. Place and Duration of Study: Department of Engineering, Ibri College of Technology, between September 2017 and May 2018. Methodology: Necessary information about existing character recognition algorithms was collected by reviewing the relevant literature available in journals, books, manuals, and related documents. A suitable architecture and a novel algorithm for a simple, low-cost, low-complexity, highly accurate system were developed as per the specifications and the reviewed literature. The functionality of the design was verified using MATLAB simulation. Results: The proposed method can extract characters from a document image (scanned or camera captured) of any font size, colour, or spacing, and the text can be rewritten in an editable window such as Notepad or WordPad, where the characters can even be edited; this improves accuracy and saves time. Conclusion: The algorithm gives promising results on a number of images, in which almost all characters are retrieved, and achieves 90 percent accuracy for all printed characters.
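
As a loose Python/OpenCV counterpart of the segmentation step described above (the paper itself works in MATLAB), one might binarize the page, take connected components, and crop candidate character boxes; the size filter and reading-order heuristic here are illustrative guesses:

# Binarize, label connected components, and crop candidate character boxes.
import cv2

def extract_character_boxes(path, min_area=20):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu threshold, inverted so ink becomes foreground
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = []
    for i in range(1, n):                        # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= min_area:
            boxes.append((x, y, w, h))
    # sort roughly in reading order: top-to-bottom bands, then left-to-right
    boxes.sort(key=lambda b: (b[1] // 50, b[0]))
    return [gray[y:y + h, x:x + w] for (x, y, w, h) in boxes]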


2017 ◽  
Vol 8 (1) ◽  
pp. 61-76 ◽  
Author(s):  
Aicha Eutamene ◽  
Mohamed Khireddine Kholladi ◽  
Djamel Gaceb ◽  
Hacene Belhadef

In the past two decades, solving complex search and optimization problems with bio-inspired metaheuristic algorithms has received considerable attention among researchers. In this paper, document image preprocessing is treated as an optimization problem, and the PSO (Particle Swarm Optimization) algorithm is chosen to solve it in order to select the best parameters. The preprocessing step is the basis of all other steps in an OCR (Optical Character Recognition) system, such as binarization, segmentation, skew correction, layout extraction, textual zone detection, and recognition itself. Without preprocessing, the presence of degradation in the image significantly reduces the performance of these steps. The authors' contribution focuses on preprocessing of the smoothing and filtering type, using a new Adaptive Mean Shift algorithm based on the integral image. Local adaptation to the image quality accelerates conventional smoothing by avoiding the processing of homogeneous zones. The authors' goal is to show how the PSO algorithm can improve result quality and the choice of parameters in preprocessing methods for document images. Comparative studies as well as tests on an existing dataset are reported to confirm the efficiency of the proposed approach.
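
For readers unfamiliar with PSO, a bare-bones swarm loop in the spirit described above is sketched below: each particle is a candidate vector of preprocessing parameters, and the objective scores the preprocessed image. The objective, bounds, and hyperparameters are placeholders, and preprocess_quality is a hypothetical scoring function, not part of the paper:

# Standard global-best PSO minimizing an arbitrary objective over box bounds.
import numpy as np

def pso(objective, bounds, n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5):
    lo, hi = np.array(bounds).T                         # bounds: [(lo, hi), ...]
    dim = len(bounds)
    pos = np.random.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = np.random.rand(n_particles, dim), np.random.rand(n_particles, dim)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# e.g. tune a smoothing bandwidth and a contrast parameter (hypothetical objective):
# best = pso(lambda p: preprocess_quality(image, *p), bounds=[(1, 15), (0.1, 2.0)])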


2015 ◽  
Vol 15 (01) ◽  
pp. 1550002
Author(s):  
Brij Mohan Singh ◽  
Rahul Sharma ◽  
Debashis Ghosh ◽  
Ankush Mittal

Many documents, such as maps, engineering drawings, and artistic documents, contain both printed and handwritten material in which text regions and text-lines are not parallel to each other, are curved in nature, and hold various types of text with different font sizes, text and non-text areas lying close to each other, and non-straight, skewed, and warped text-lines. Commercially available optical character recognition (OCR) systems such as ABBYY FineReader and Free OCR are not capable of handling such a range of stylistic document images containing curved, multi-oriented, and stylish font text-lines. Extraction of individual text-lines and words from these documents is generally not straightforward. Most of the segmentation work reported is on simple documents, and it remains a highly challenging task to implement an OCR that works under all possible conditions and gives highly accurate results, especially in the case of stylistic documents. This paper presents an approach based on dilation and flood-fill morphological operations that extracts multi-oriented text-lines and, in subsequent stages, words from complex-layout or stylistic document images. The segmentation results obtained from our method prove to be superior to those of the standard profiling-based method.
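
A simplified sketch of the morphology-driven idea follows: dilate the ink so that characters within a line merge, then take each connected component as one text-line mask. The kernel size is an illustrative choice, and the paper couples dilation with flood filling and further word-level steps not shown here:

# Dilation-based text-line grouping with OpenCV.
import cv2
import numpy as np

def extract_text_line_masks(gray, kernel_size=(25, 5)):
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernel_size)
    dilated = cv2.dilate(binary, kernel, iterations=1)   # merge characters into lines
    n, labels = cv2.connectedComponents(dilated, connectivity=8)
    masks = []
    for i in range(1, n):                                # label 0 is the background
        component = (labels == i).astype(np.uint8)
        masks.append(cv2.bitwise_and(binary, binary, mask=component))
    return masks                                         # one ink mask per text-line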


2015 ◽  
Vol 4 (2) ◽  
pp. 74-94
Author(s):  
Pawan Kumar Singh ◽  
Ram Sarkar ◽  
Mita Nasipuri

Script identification has been an appealing research interest in the field of document image analysis over the last few decades. Accurate recognition of the script is paramount to many post-processing steps such as automated document sorting, machine translation, and searching for text written in a particular script in a multilingual environment. For automatic processing of such documents through Optical Character Recognition (OCR) software, it is necessary to identify the script of the different words in the documents before feeding them to the OCR engines of the individual scripts. In this paper, a robust word-level handwritten script identification technique is proposed that uses texture-based features to identify words written in any of seven popular scripts, namely Bangla, Devanagari, Gurumukhi, Malayalam, Oriya, Telugu, and Roman. The texture-based features comprise a combination of Histograms of Oriented Gradients (HOG) and moment invariants. The technique has been tested on 7000 handwritten text words, with each script contributing 1000 words. Based on the identification accuracies and statistical significance testing of seven well-known classifiers, the Multi-Layer Perceptron (MLP) was chosen as the final classifier and then tested comprehensively using different folds and different epoch sizes. The overall accuracy of the system is found to be 94.7% using a 5-fold cross-validation scheme, which is quite impressive considering the complexities and shape variations of the said scripts. This is an extended version of the paper described in (Singh et al., 2014).
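
As a small sketch of the feature/classifier pairing described above, HOG descriptors of word images can be fed to a Multi-Layer Perceptron; moment invariants could be appended to the feature vector in the same way. The image size, HOG parameters, and MLP layout below are illustrative, not the paper's settings:

# HOG word descriptors plus an MLP classifier for script labels.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.neural_network import MLPClassifier

def word_features(word_img):
    patch = resize(word_img, (64, 128), anti_aliasing=True)   # normalize word size
    return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_script_classifier(word_images, script_labels):
    X = np.array([word_features(img) for img in word_images])
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
    clf.fit(X, script_labels)
    return clf   # clf.predict(word_features(new_word)[None, :]) gives the script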

