Improvement of Image Binarization Methods Using Image Preprocessing with Local Entropy Filtering for Alphanumerical Character Recognition Purposes

Automatic text recognition from the natural images acquired in uncontrolled lighting conditions is a challenging task due to the presence of shadows hindering the shape analysis and classification of individual characters. Since the optical character recognition methods require prior image binarization, the application of classical global thresholding methods in such case makes it impossible to preserve the visibility of all characters. Nevertheless, the use of adaptive binarization does not always lead to satisfactory results for heavily unevenly illuminated document images. In this paper, the image preprocessing methodology with the use of local image entropy filtering is proposed, allowing for the improvement of various commonly used image thresholding methods, which can be useful also for text recognition purposes. The proposed approach was verified using a dataset of 140 differently illuminated document images subjected to further text recognition. Experimental results, expressed as Levenshtein distances and F-Measure values for obtained text strings, are promising and confirm the usefulness of the proposed approach.

Download Full-text

Exposure Bracketing Techniques for Camera Document Image Enhancement

Applied Sciences ◽

10.3390/app9214529 ◽

2019 ◽

Vol 9 (21) ◽

pp. 4529

Author(s):

Tao Liu ◽

Hao Liu ◽

Yingying Wu ◽

Bo Yin ◽

Zhiqiang Wei

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Document Image ◽

Automatic Registration ◽

Digital Cameras ◽

Document Images ◽

Registration Method ◽

Lighting Conditions ◽

High Quality Image ◽

Multiple Document

Capturing document images using digital cameras in uneven lighting conditions is challenging, leading to poorly captured images, which hinders the processing that follows, such as Optical Character Recognition (OCR). In this paper, we propose the use of exposure bracketing techniques to solve this problem. Instead of capturing one image, we used several images that were captured with different exposure settings and used the exposure bracketing technique to generate a high-quality image that incorporates useful information from each image. We found that this technique can enhance image quality and provides an effective way of improving OCR accuracy. Our contributions in this paper are two-fold: (1) a preprocessing chain that uses exposure bracketing techniques for document images is discussed, and an automatic registration method is proposed to find the geometric disparity between multiple document images, which lays the foundation for exposure bracketing; (2) several representative exposure bracketing algorithms are incorporated in the processing chain and their performances are evaluated and compared.

Download Full-text

Robust Combined Binarization Method of Non-Uniformly Illuminated Document Images for Alphanumerical Character Recognition

Sensors ◽

10.3390/s20102914 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2914

Author(s):

Hubert Michalak ◽

Krzysztof Okarma

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Recognition Accuracy ◽

Image Data ◽

Document Images ◽

Historical Document ◽

Image Binarization ◽

Optical Character ◽

Binarization Method ◽

Camera Sensors

Image binarization is one of the key operations decreasing the amount of information used in further analysis of image data, significantly influencing the final results. Although in some applications, where well illuminated images may be easily captured, ensuring a high contrast, even a simple global thresholding may be sufficient, there are some more challenging solutions, e.g., based on the analysis of natural images or assuming the presence of some quality degradations, such as in historical document images. Considering the variety of image binarization methods, as well as their different applications and types of images, one cannot expect a single universal thresholding method that would be the best solution for all images. Nevertheless, since one of the most common operations preceded by the binarization is the Optical Character Recognition (OCR), which may also be applied for non-uniformly illuminated images captured by camera sensors mounted in mobile phones, the development of even better binarization methods in view of the maximization of the OCR accuracy is still expected. Therefore, in this paper, the idea of the use of robust combined measures is presented, making it possible to bring together the advantages of various methods, including some recently proposed approaches based on entropy filtering and a multi-layered stack of regions. The experimental results, obtained for a dataset of 176 non-uniformly illuminated document images, referred to as the WEZUT OCR Dataset, confirm the validity and usefulness of the proposed approach, leading to a significant increase of the recognition accuracy.

Download Full-text

Mobile Application for Recognizing Text in Degraded Document Images Using Optical Character Recognition with Adaptive Document Image Binarization

Journal of Image and Graphics ◽

10.18178/joig.6.1.44-47 ◽

2018 ◽

Vol 6 (1) ◽

pp. 44-47

Author(s):

Angie M. Ceniza ◽

◽

Tom Kalvin B. Archival ◽

Kate V. Bongo

Keyword(s):

Mobile Application ◽

Character Recognition ◽

Optical Character Recognition ◽

Document Image ◽

Document Images ◽

Image Binarization ◽

Optical Character ◽

Document Image Binarization ◽

Degraded Document

Download Full-text

Image Enhancement of Complex Document Images Using Histogram of Gradient Features

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.36.24244 ◽

2018 ◽

Vol 7 (4.36) ◽

pp. 780

Author(s):

Sajan A. Jain ◽

N. Shobha Rani ◽

N. Chandan

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Processing System ◽

Recognition System ◽

Document Images ◽

Uniform Illumination ◽

Optical Character ◽

Research Challenge

Enhancement of document images is an interesting research challenge in the process of character recognition. It is quite significant to have a document with uniform illumination gradient to achieve higher recognition accuracies through a document processing system like Optical Character Recognition (OCR). Complex document images are one of the varied image categories that are difficult to process compared to other types of images. It is the quality of document that decides the precision of a character recognition system. Hence transforming the complex document images to a uniform illumination gradient is foreseen. In the proposed research, ancient document images of UMIACS Tobacco 800 database are considered for removal of marginal noise. The proposed technique carries out the block wise interpretation of document contents to remove the marginal noise that is present usually at the borders of images. Further, Hu moment’s features are computed for the detection of marginal noise in every block. An empirical analysis is carried out for classification of blocks into noisy or non-noisy and the outcomes produced by algorithm are satisfactory and feasible for subsequent analysis.

Download Full-text

Text Recognition from Document Images

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1951152 ◽

2019 ◽

pp. 75-79

Author(s):

Revanth Yenugudhati ◽

Suresh Babu Papanaboina ◽

Suryatej Vasireddy ◽

Yaswanth Seelam

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Reading Skills ◽

Text Recognition ◽

Current Investigation ◽

Document Images ◽

Optical Character ◽

Specific Protocol ◽

Technological Advances ◽

Machine Reading

The objective is the development of effective reading skills in machines. After reading the text and comprehending the meaning, it would know itself and according to the program, it would implement the instructions. The current investigation presents an algorithm and software which detects, recognizes text and character with specific protocol in a image and programs itself according to the text. Technological advances in image processing have acquainted us with character recognition and many such related technologies, which have proved to be a milestone. However, even years after the invention of these technologies we have not been able to achieve a technology by which machine can read, interpret and act according to the instructions and even update their database if required. Here’s an attempt to make this reality. Machine replication of human functions, like reading, is a long-awaited dream. However, over the last five decades, machine reading has transformed from a dream to reality. Text detection and character recognition known as Optical Character Recognition (OCR) has become one of the most successful applications of technology in the field of pattern recognition and artificial intelligence. Numerous commercial systems for OCR exist for a variety of applications

Download Full-text

A Dual PSO-Adaptive Mean Shift for Preprocessing Optimization on Degraded Document Images

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2017010104 ◽

2017 ◽

Vol 8 (1) ◽

pp. 61-76 ◽

Cited By ~ 2

Author(s):

Aicha Eutamene ◽

Mohamed Khireddine Kholladi ◽

Djamel Gaceb ◽

Hacene Belhadef

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Optimization Problems ◽

Mean Shift ◽

Pso Algorithm ◽

Recognition System ◽

Document Image ◽

Document Images ◽

Image Preprocessing ◽

Mean Shift Algorithm

In the two past decades, solving complex search and optimization problems with bioinspired metaheuristic algorithms has received considerable attention among researchers. In this paper, the image preprocessing is considered as an optimization problem and the PSO (Particle Swarm Optimization) algorithm was been chosen to solve it in order to select the best parameters. The document image preprocessing step is the basis of all other steps in OCR (Optical Character Recognition) system, such as binarization, segmentation, skew correction, layout extraction, textual zones detection and OCR. Without preprocessing, the presence of degradation in the image significantly reduces the performance of these steps. The authors' contribution focuses on the preprocessing of type: smoothing and filtering document images using a new Adaptive Mean Shift algorithm based on the integral image. The local adaptation to the image quality accelerates the conventional smoothing avoiding the preprocessing of homogeneous zones. The authors' goal is to show how PSO algorithm can improve the results quality and the choice of parameters in pre-processing's methods of document images. Comparative studies as well as tests over the existing dataset have been reported to confirm the efficiency of the proposed approach.

Download Full-text

Multi-Oriented Text Extraction in Stylistic Documents

International Journal of Image and Graphics ◽

10.1142/s0219467815500023 ◽

2015 ◽

Vol 15 (01) ◽

pp. 1550002

Author(s):

Brij Mohan Singh ◽

Rahul Sharma ◽

Debashis Ghosh ◽

Ankush Mittal

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Morphological Operations ◽

Document Images ◽

Font Size ◽

Text Extraction ◽

Optical Character ◽

Engineering Drawings ◽

Flood Fill

In many documents such as maps, engineering drawings and artistic documents, etc. there exist many printed as well as handwritten materials where text regions and text-lines are not parallel to each other, curved in nature, and having various types of text such as different font size, text and non-text areas lying close to each other and non-straight, skewed and warped text-lines. Optical character recognition (OCR) systems available commercially such as ABYY fine reader and Free OCR, are not capable of handling different ranges of stylistic document images containing curved, multi-oriented, and stylish font text-lines. Extraction of individual text-lines and words from these documents is generally not straight forward. Most of the segmentation works reported is on simple documents but still it remains a highly challenging task to implement an OCR that works under all possible conditions and gives highly accurate results, especially in the case of stylistic documents. This paper presents dilation and flood fill morphological operations based approach that extracts multi-oriented text-lines and words from the complex layout or stylistic document images in the subsequent stages. The segmentation results obtained from our method proves to be superior over the standard profiling-based method.

Download Full-text

Automated hierarchical classification of scanned documents using convolutional neural network and regular expression

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v12i1.pp1018-1029 ◽

2022 ◽

Vol 12 (1) ◽

pp. 1018

Author(s):

Rifiana Arief ◽

Achmad Benny Mutiara ◽

Tubagus Maulana Kusuma ◽

Hustinawaty Hustinawaty

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Character Recognition ◽

Optical Character Recognition ◽

Regular Expression ◽

Hierarchical Classification ◽

Document Vector ◽

Classification Prediction ◽

Scanned Documents

<p>This research proposed automated hierarchical classification of scanned documents with characteristics content that have unstructured text and special patterns (specific and short strings) using convolutional neural network (CNN) and regular expression method (REM). The research data using digital correspondence documents with format PDF images from pusat data teknologi dan informasi (technology and information data center). The document hierarchy covers type of letter, type of manuscript letter, origin of letter and subject of letter. The research method consists of preprocessing, classification, and storage to database. Preprocessing covers extraction using Tesseract optical character recognition (OCR) and formation of word document vector with Word2Vec. Hierarchical classification uses CNN to classify 5 types of letters and regular expression to classify 4 types of manuscript letter, 15 origins of letter and 25 subjects of letter. The classified documents are stored in the Hive database in Hadoop big data architecture. The amount of data used is 5200 documents, consisting of 4000 for training, 1000 for testing and 200 for classification prediction documents. The trial result of 200 new documents is 188 documents correctly classified and 12 documents incorrectly classified. The accuracy of automated hierarchical classification is 94%. Next, the search of classified scanned documents based on content can be developed.</p>

Download Full-text

SCENE TEXT RECOGNITION BY USING EE-MSER AND OPTICAL CHARACTER RECOGNITION FOR NATURAL IMAGES

International Journal of Advance Engineering and Research Development ◽

10.21090/ijaerd.021219 ◽

2015 ◽

Vol 2 (12) ◽

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Natural Images ◽

Text Recognition ◽

Optical Character ◽

Scene Text ◽

Scene Text Recognition

Download Full-text

Improving the Accuracy of Tesseract 4.0 OCR Engine Using Convolution-Based Preprocessing

Symmetry ◽

10.3390/sym12050715 ◽

2020 ◽

Vol 12 (5) ◽

pp. 715

Author(s):

Dan Sporici ◽

Elena Cușnir ◽

Costin-Anton Boiangiu

Keyword(s):

Reinforcement Learning ◽

Character Recognition ◽

Optical Character Recognition ◽

Edit Distance ◽

Ground Truth ◽

Image Preprocessing ◽

Optical Character ◽

Great Performance ◽

Relative Change ◽

Reinforcement Learning Model

Optical Character Recognition (OCR) is the process of identifying and converting texts rendered in images using pixels to a more computer-friendly representation. The presented work aims to prove that the accuracy of the Tesseract 4.0 OCR engine can be further enhanced by employing convolution-based preprocessing using specific kernels. As Tesseract 4.0 has proven great performance when evaluated against a favorable input, its capability of properly detecting and identifying characters in more realistic, unfriendly images is questioned. The article proposes an adaptive image preprocessing step guided by a reinforcement learning model, which attempts to minimize the edit distance between the recognized text and the ground truth. It is shown that this approach can boost the character-level accuracy of Tesseract 4.0 from 0.134 to 0.616 (+359% relative change) and the F1 score from 0.163 to 0.729 (+347% relative change) on a dataset that is considered challenging by its authors.

Download Full-text