A NEURAL-BASED PAGE SEGMENTATION SYSTEM

2005 ◽  
Vol 14 (01) ◽  
pp. 109-122 ◽  
Author(s):  
Y. Alginahi ◽  
D. Fekri ◽  
M. A. Sid-Ahmed

Page segmentation is necessary for optical character recognition and very useful in document image manipulation. This paper describes two classification methods, a modified linear adaptive method and a proposed neural network system, that classify an image into text, halftone images (photos, dark images, etc.), and graphics (graphs, tables, flowcharts, etc.). The blocks were segmented using the Run Length Smearing Algorithm, with the smearing thresholds fixed so that the process runs automatically. Features are extracted from the segmented blocks for classification into text, graphics, and halftone images. The second method uses a multi-layer perceptron neural network: two parameters, a shape factor f1 and an angle derived from the rectangular block segments, are fed into the network, yielding the three classes text, halftone images, and graphics. Experiments on 30 mixed-content document images show that the method works well on a wide variety of document layouts.
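The Run Length Smearing Algorithm named above is well documented: runs of background pixels shorter than a threshold are filled with foreground, horizontally and vertically, and the two results are combined to form block candidates. A minimal sketch follows; the threshold values (100 horizontal, 30 vertical) are illustrative assumptions, not the fixed values used in the paper.

```python
import numpy as np

def smear_runs(binary: np.ndarray, threshold: int) -> np.ndarray:
    """Fill runs of background (0) shorter than `threshold` pixels with foreground (1),
    scanning each row left to right."""
    out = binary.copy()
    for row in out:
        run_start = None
        for j, v in enumerate(row):
            if v == 0 and run_start is None:
                run_start = j                      # a background run begins
            elif v == 1 and run_start is not None:
                if j - run_start < threshold:      # short gap between foreground pixels
                    row[run_start:j] = 1           # smear it
                run_start = None
    return out

def rlsa(binary: np.ndarray, h_thresh: int = 100, v_thresh: int = 30) -> np.ndarray:
    """Run Length Smearing Algorithm: smear horizontally and vertically,
    then AND the two images to obtain candidate block regions."""
    horizontal = smear_runs(binary, h_thresh)
    vertical = smear_runs(binary.T, v_thresh).T    # transpose to reuse the row scanner
    return horizontal & vertical
```

Connected components of the resulting image give the rectangular block segments from which the classification features are then extracted.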

2019 ◽  
Vol 9 (21) ◽  
pp. 4529
Author(s):  
Tao Liu ◽  
Hao Liu ◽  
Yingying Wu ◽  
Bo Yin ◽  
Zhiqiang Wei

Capturing document images with digital cameras under uneven lighting conditions is challenging: poorly exposed captures hinder the processing that follows, such as Optical Character Recognition (OCR). In this paper, we propose the use of exposure bracketing techniques to solve this problem. Instead of capturing one image, we capture several images with different exposure settings and use exposure bracketing to generate a single high-quality image that incorporates the useful information from each capture. We found that this technique enhances image quality and provides an effective way of improving OCR accuracy. Our contributions are two-fold: (1) a preprocessing chain that applies exposure bracketing techniques to document images is discussed, and an automatic registration method is proposed to find the geometric disparity between the multiple document images, which lays the foundation for exposure bracketing; (2) several representative exposure bracketing algorithms are incorporated into the processing chain and their performance is evaluated and compared.
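The abstract does not spell out which registration or fusion algorithms are used, so the sketch below is only a rough illustration of such a pipeline: it aligns a bracketed series with OpenCV's ECC image alignment and fuses it with Mertens exposure fusion, one representative bracketing algorithm. The function names and parameter values here are assumptions, not the paper's method.

```python
import cv2
import numpy as np

def register_to_reference(reference: np.ndarray, moving: np.ndarray) -> np.ndarray:
    """Estimate a homography aligning `moving` to `reference` by ECC maximization,
    then warp `moving` into the reference frame."""
    ref_gray = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)
    mov_gray = cv2.cvtColor(moving, cv2.COLOR_BGR2GRAY)
    warp = np.eye(3, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)
    _, warp = cv2.findTransformECC(ref_gray, mov_gray, warp,
                                   cv2.MOTION_HOMOGRAPHY, criteria, None, 5)
    h, w = reference.shape[:2]
    return cv2.warpPerspective(moving, warp, (w, h),
                               flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)

def fuse_bracketed(paths: list[str]) -> np.ndarray:
    """Align a bracketed exposure series to the first shot, then fuse with
    Mertens exposure fusion (no camera response curve needed)."""
    images = [cv2.imread(p) for p in paths]
    aligned = [images[0]] + [register_to_reference(images[0], im) for im in images[1:]]
    fused = cv2.createMergeMertens().process(aligned)   # float32 in roughly [0, 1]
    return np.clip(fused * 255, 0, 255).astype(np.uint8)
```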


Author(s):  
María José Castro-Bleda ◽  
Salvador España-Boquera ◽  
Francisco Zamora-Martínez

The field of off-line optical character recognition (OCR) has been a topic of intensive research for many years (Bozinovic, 1989; Bunke, 2003; Plamondon, 2000; Toselli, 2004). One of the first steps in the classical architecture of a text recognizer is preprocessing, where noise reduction and normalization take place. Many systems do not require a binarization step, so the images are maintained in gray-level quality. Document enhancement not only influences the overall performance of OCR systems, but it can also significantly improve document readability for human readers. In many cases, the noise of document images is heterogeneous, and a technique fitted for one type of noise may not be valid for the overall set of documents. One possible solution to this problem is to use several filters or techniques and to provide a classifier that selects the appropriate one. Neural networks have been used for document enhancement (see (Egmont-Petersen, 2002) for a review of image processing with neural networks). One advantage of neural network filters for image enhancement and denoising is that a different neural filter can be automatically trained for each type of noise. This work proposes the clustering of neural network filters to avoid having to label training data and to reduce the number of filters needed by the enhancement system. An agglomerative hierarchical clustering algorithm of supervised classifiers is proposed to do this. The technique has been applied to filter out the background noise typical of office documents (coffee stains and footprints on documents, folded sheets with degraded printed text, etc.).
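The building block of such a system is a trainable neural filter that maps a noisy pixel neighborhood to its clean value. A minimal single-filter sketch is given below, assuming 5x5 neighborhoods and a scikit-learn MLP; the full system trains one such filter per noise cluster and uses a classifier to select among them.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def extract_patches(noisy: np.ndarray, clean: np.ndarray, size: int = 5):
    """Pair each size x size noisy neighborhood with its clean center pixel."""
    r = size // 2
    X, y = [], []
    for i in range(r, noisy.shape[0] - r):
        for j in range(r, noisy.shape[1] - r):
            X.append(noisy[i - r:i + r + 1, j - r:j + r + 1].ravel())
            y.append(clean[i, j])
    return np.array(X) / 255.0, np.array(y) / 255.0

def train_filter(noisy: np.ndarray, clean: np.ndarray) -> MLPRegressor:
    """Train one neural filter for one noise type (e.g. coffee stains)."""
    X, y = extract_patches(noisy, clean)
    net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=200)
    return net.fit(X, y)

def apply_filter(net: MLPRegressor, noisy: np.ndarray, size: int = 5) -> np.ndarray:
    """Slide the trained filter over the image to produce the enhanced output."""
    r = size // 2
    out = noisy.astype(np.float64) / 255.0
    X, _ = extract_patches(noisy, noisy, size)
    preds = net.predict(X).reshape(noisy.shape[0] - 2 * r, noisy.shape[1] - 2 * r)
    out[r:-r, r:-r] = preds
    return np.clip(out * 255, 0, 255).astype(np.uint8)
```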


Author(s):  
Priti P. Rege ◽  
Shaheera Akhter

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task, and it is necessary for improving the accuracy of an OCR system. Traditionally, separating text from a document has relied on feature extraction processes that require handcrafting of the features. Deep learning-based methods, however, are excellent feature extractors that learn features from the training data automatically. Deep learning gives state-of-the-art results on various computer vision tasks, such as image classification, segmentation, image captioning, object detection, and recognition. This chapter compares various traditional as well as deep-learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned for transfer learning to get more precise results. The final results show that deep learning methods give more accurate results compared with conventional image processing methods for Devanagari text extraction.
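A compact U-Net sketch for per-pixel text/background separation is shown below; the depth and channel widths are illustrative assumptions, far smaller than the U-Net and ResU-Net models the chapter fine-tunes.

```python
import torch
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU, the basic U-Net unit."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Two-level U-Net producing a per-pixel text/background map
    for a single-channel (grayscale) document image."""
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)      # 128 = upsampled 64 + skip 64
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)       # 64 = upsampled 32 + skip 32
        self.head = nn.Conv2d(32, 1, 1)      # logits; apply sigmoid at inference

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)

# Training pairs the network output with pixel-level text masks, e.g.:
# loss = nn.BCEWithLogitsLoss()(TinyUNet()(batch), masks)
```

A ResU-Net variant replaces each conv_block with a residual block; the encoder-decoder-with-skips structure stays the same.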


Author(s):  
Neha N.

Document image processing is an increasingly important technology, essential in all optical character recognition (OCR) systems and for the automation of various office documents. A document originally has zero skew (tilt), but when a page is scanned or photocopied, skew may be introduced by various factors and is practically unavoidable. The presence of even a small amount of skew (0.5°) has detrimental effects on document analysis, as it directly affects the reliability and efficiency of the segmentation, recognition, and feature extraction stages. Removal of skew is therefore of paramount importance in the field of document analysis and OCR, and it is the first step to be accomplished. This paper presents a novel technique for skew detection and correction that is both language and content independent. The proposed technique is based on the maximum density of the foreground pixels and their orientation in the document image. Unlike conventional algorithms that work only for machine-printed textual documents scripted in English, this technique works well for all kinds of document images (machine printed, handwritten, complex, noisy, and simple). The technique presented here was tested on 150 different document image samples and found to provide results with an accuracy of 0.1°.
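The abstract does not detail how the maximum foreground density and orientation are computed, so the sketch below uses a standard stand-in technique rather than the paper's own method: rotate the binarized page over candidate angles and pick the angle that maximizes the variance of the horizontal projection profile, searched at the 0.1° granularity the paper reports.

```python
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary: np.ndarray, max_angle: float = 15.0,
                  step: float = 0.1) -> float:
    """Return the rotation angle (degrees) whose horizontal projection
    profile has maximum variance, i.e. the sharpest text-line peaks."""
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step, step):
        rotated = rotate(binary, angle, reshape=False, order=0)
        profile = rotated.sum(axis=1).astype(np.float64)  # foreground per row
        score = np.var(profile)   # peaks sharpen when text lines are horizontal
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

def deskew(binary: np.ndarray) -> np.ndarray:
    """Rotate the document by the estimated angle to remove the skew."""
    return rotate(binary, estimate_skew(binary), reshape=False, order=0)
```

Projection-profile search works only for roughly text-line-structured pages, which is precisely the limitation the paper's density-and-orientation approach aims to overcome for handwritten, complex, and noisy documents.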


Author(s):  
Shyamali Mitra ◽  
K. C. Santosh ◽  
Mrinal Kanti Naskar

Binarization plays a crucial role in Optical Character Recognition (OCR) and its ancillary domains, such as the recovery of degraded document images. In Document Image Analysis (DIA), selecting a threshold is not trivial, since it differs from one problem (dataset) to another. Instead of trying several different thresholds for each dataset, our proposed binarization scheme accounts for the noise inherent in document images. The proposed stochastic architecture implements a local thresholding technique: Niblack's binarization algorithm. We introduce a stochastic comparator circuit that works on unipolar stochastic numbers; unlike conventional stochastic circuits, it is simple and easy to deploy. We implemented it on the Xilinx Virtex6 XC6VLX760-2FF1760 FPGA platform and obtained encouraging experimental results. The complete set of results is available upon request. Moreover, compared to conventional designs, the proposed stochastic implementation is better in terms of both time complexity and fault tolerance.
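Niblack's rule itself is standard: each pixel is thresholded at T(x, y) = m(x, y) + k·s(x, y), the local mean plus k times the local standard deviation over a sliding window. Below is a plain software sketch of that rule (the paper's contribution is its stochastic hardware realization, which is not reproduced here); the window size 25 and k = -0.2 are conventional defaults, not necessarily the paper's settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(img: np.ndarray, window: int = 25, k: float = -0.2) -> np.ndarray:
    """Niblack local thresholding: T = local_mean + k * local_std."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, window)                 # m(x, y)
    mean_sq = uniform_filter(img ** 2, window)
    std = np.sqrt(np.maximum(mean_sq - mean ** 2, 0))  # s(x, y)
    threshold = mean + k * std
    # Pixels brighter than the local threshold become white (background).
    return (img > threshold).astype(np.uint8) * 255
```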


Author(s):  
Soumya Mishra ◽  
Debashish Nanda ◽  
Sanghamitra Mohanty

The biggest challenge in the field of image processing is to recognize documents in both printed and handwritten form. Optical Character Recognition (OCR) is a type of document image analysis in which a scanned digital image containing either machine-printed or handwritten script is input into an OCR engine and translated into an editable, machine-readable digital text format. The development of OCR systems for Indian scripts is an active area of research today. We attempt to develop an OCR system for the Oriya language, the official language of Orissa. Oriya presents great challenges to an OCR designer due to the large number of letters in the alphabet, the sophisticated ways in which they combine, and the complicated graphemes that result. In this paper, we argue that a number of automatic and semi-automatic tools can ease the development of recognizers for new font styles and new scripts, and we briefly discuss how they have helped build new OCRs for recognizing Oriya script. We used a backpropagation neural network for efficient recognition: errors are corrected through backpropagation, and the corrected neuron values are transmitted by the feed-forward method through the multiple layers of the network, i.e. the input layer, the hidden layer(s), and the output layer.
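A minimal NumPy sketch of the training scheme described, one hidden layer, a feed-forward pass, and a backpropagation weight update, is given below; the layer sizes and learning rate are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

class BackpropNet:
    """Minimal one-hidden-layer network trained with backpropagation."""
    def __init__(self, n_in: int, n_hidden: int, n_out: int):
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_out))

    def forward(self, X: np.ndarray) -> np.ndarray:
        self.h = sigmoid(X @ self.W1)     # hidden-layer activations
        return sigmoid(self.h @ self.W2)  # output-layer activations

    def train_step(self, X: np.ndarray, T: np.ndarray, lr: float = 0.5) -> float:
        Y = self.forward(X)
        # Output error, then propagate it back through the hidden layer.
        d_out = (Y - T) * Y * (1 - Y)
        d_hid = (d_out @ self.W2.T) * self.h * (1 - self.h)
        self.W2 -= lr * self.h.T @ d_out
        self.W1 -= lr * X.T @ d_hid
        return float(np.mean((Y - T) ** 2))

# Usage: X holds flattened glyph images, T one-hot character labels, e.g.
# net = BackpropNet(n_in=32 * 32, n_hidden=128, n_out=n_classes)
```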


2015 ◽  
Vol 9 (3) ◽  
pp. 1-8
Author(s):  
J. O. Adigun ◽  
E. O. Omidiora ◽  
S. O. Olabiyisi ◽  
O. D. Fenwa ◽  
O. Oladipo ◽  
...  
