Robust Combined Binarization Method of Non-Uniformly Illuminated Document Images for Alphanumerical Character Recognition

Sensors, 2020, Vol 20 (10), pp. 2914
Author(s): Hubert Michalak, Krzysztof Okarma

Image binarization is one of the key operations for reducing the amount of information used in further analysis of image data, and it significantly influences the final results. In some applications, where well-illuminated, high-contrast images can be captured easily, even simple global thresholding may be sufficient. Other cases are more challenging, e.g., those based on the analysis of natural images or those involving quality degradations, as in historical document images. Given the variety of image binarization methods, as well as their different applications and types of images, one cannot expect a single universal thresholding method to be the best solution for all images. Nevertheless, one of the most common operations preceded by binarization is Optical Character Recognition (OCR), which may also be applied to non-uniformly illuminated images captured by camera sensors mounted in mobile phones, so better binarization methods that maximize OCR accuracy are still needed. Therefore, this paper presents the idea of using robust combined measures, which makes it possible to bring together the advantages of various methods, including some recently proposed approaches based on entropy filtering and a multi-layered stack of regions. The experimental results, obtained for a dataset of 176 non-uniformly illuminated document images referred to as the WEZUT OCR Dataset, confirm the validity and usefulness of the proposed approach, which leads to a significant increase in recognition accuracy.
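The contrast between global and local thresholding, and the idea of combining several binarization results, can be illustrated with a minimal sketch. The abstract does not state how the methods are combined, so the pixel-wise majority vote below is only an assumed stand-in, not the authors' combined measure. The function names, the `radius` and `offset` parameters, and the convention that 1 marks bright background and 0 marks dark text are all hypothetical.

```python
from statistics import mean


def global_threshold(img, t=None):
    """Binarize with a single threshold for the whole image.

    If no threshold is given, the global mean intensity is used.
    """
    if t is None:
        t = mean(p for row in img for p in row)
    return [[1 if p > t else 0 for p in row] for row in img]


def local_mean_threshold(img, radius=1, offset=0):
    """Binarize each pixel against the mean of its square neighbourhood.

    A pixel is marked 1 when it is brighter than (local mean - offset).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = 1 if img[y][x] > mean(window) - offset else 0
    return out


def majority_vote(masks):
    """Pixel-wise majority vote over an odd number of binary masks
    (assumed combination rule, not taken from the paper)."""
    h, w = len(masks[0]), len(masks[0][0])
    need = len(masks) // 2 + 1
    return [[1 if sum(m[y][x] for m in masks) >= need else 0
             for x in range(w)]
            for y in range(h)]
```

A caller would produce three or more masks with different methods or parameters and pass them to `majority_vote`; under non-uniform illumination the local masks are expected to outvote a global mask that fails in the darker part of the page.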

2021, Vol 4 (1), pp. 57-70
Author(s): Marina V. Polyakova, Alexandr G. Nesteryuk

Optical character recognition systems for images are used to convert books and documents into electronic form, to automate accounting systems in business, to recognize markers in augmented reality applications, etc. When binarization is applied, the quality of optical character recognition is largely determined by how well the foreground pixels are separated from the background. Existing methods of text image binarization are analyzed, and their binarization quality is found to be insufficient. The minimum-distance classifier is used to improve an existing method of binarization of color text images. To improve the binarization quality, it is advisable to divide the image pixels into two classes, “Foreground” and “Background”, using a classification method, namely a minimum-distance classifier, instead of heuristic threshold selection. To reduce the amount of information processed before the classifier is applied, it is advisable to select blocks of pixels for subsequent processing; this was done by analyzing the connected components of the original image. An improved method of color text image binarization that uses connected component analysis and a minimum-distance classifier has been developed. The study of the developed method showed that it is better than existing binarization methods in terms of robustness of binarization, but worse in terms of the error in determining the boundaries of objects. Among the recognition errors, pixels of the “Foreground” class were more often mistaken for the “Background” class. With a single prototype per class, the proposed binarization method is recommended for processing color images of printed text, for which the error in determining character boundaries introduced by binarization is compensated by the thickness of the letters.
With multiple prototypes per class, the proposed binarization method is recommended for processing color images of handwritten text, provided that high performance is not required. The improved binarization method has proved effective when the color and illumination of the text and background change slowly; however, abrupt changes in color and illumination, as well as a textured background, do not allow it to reach the binarization quality required for practical problems.
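The classification step described above can be sketched as follows. This is a minimal illustration of a minimum-distance classifier over RGB pixels, not the authors' implementation: the function names, the squared Euclidean distance in RGB space, and the prototype colors are assumptions. Each class may hold one prototype or several, which corresponds to the single-prototype and multiple-prototype variants mentioned in the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_pixel(pixel, prototypes):
    """Return the label of the class whose nearest prototype is closest.

    prototypes maps a class label to a list of RGB prototype tuples.
    """
    return min(
        prototypes,
        key=lambda label: min(sq_dist(pixel, p) for p in prototypes[label]),
    )


def binarize(pixels, prototypes):
    """Map every pixel to 1 (Foreground) or 0 (Background)."""
    return [[1 if classify_pixel(px, prototypes) == "Foreground" else 0
             for px in row]
            for row in pixels]


# Hypothetical prototypes: dark ink and red ink on light paper.
example_prototypes = {
    "Foreground": [(20, 20, 20), (150, 0, 0)],
    "Background": [(230, 230, 220)],
}
```

Adding prototypes to a class increases the number of distance computations per pixel, which is consistent with the abstract's remark that the multiple-prototype variant suits cases where high performance is not required.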


2015, Vol 15 (01), pp. 1550002
Author(s): Brij Mohan Singh, Rahul Sharma, Debashis Ghosh, Ankush Mittal

Many documents, such as maps, engineering drawings, and artistic documents, contain printed as well as handwritten material in which text regions and text-lines are not parallel to each other and are curved, with varied text such as different font sizes, text and non-text areas lying close to each other, and non-straight, skewed, and warped text-lines. Commercially available optical character recognition (OCR) systems, such as ABBYY FineReader and Free OCR, are not capable of handling the range of stylistic document images containing curved, multi-oriented, and stylish-font text-lines. Extraction of individual text-lines and words from these documents is generally not straightforward. Most of the reported segmentation work deals with simple documents, and it remains a highly challenging task to implement an OCR that works under all possible conditions and gives highly accurate results, especially in the case of stylistic documents. This paper presents an approach based on the dilation and flood-fill morphological operations that extracts multi-oriented text-lines and words, in successive stages, from complex-layout or stylistic document images. The segmentation results obtained with our method prove to be superior to those of the standard profiling-based method.
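How dilation and flood fill can work together for grouping may be easier to see in a small sketch. It assumes a binary image stored as nested lists with 1 for ink, a square structuring element, and 4-connectivity; none of these details come from the abstract, and the function names are hypothetical. Dilation merges neighbouring characters into one blob, and flood fill then labels each blob as a candidate word or text-line.

```python
from collections import deque


def dilate(img, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for j in range(max(0, y - radius), min(h, y + radius + 1)):
                    for i in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[j][i] = 1
    return out


def flood_fill_label(img):
    """Label 4-connected foreground regions with a breadth-first flood fill.

    Returns (labels, count), where labels holds 0 for background and
    1..count for the regions found.
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

In this sketch a small `radius` tends to join characters into words, while a larger one joins words into lines, so running the two steps with increasing radii is one way to obtain the staged extraction the abstract mentions.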


2021, Vol 9 (2), pp. 73-84
Author(s): Md. Shahadat Hossain, Md. Anwar Hossain, AFM Zainul Abadin, Md. Manik Ahmed

The recognition of handwritten Bangla digits has contributed significant progress to optical character recognition (OCR). It is a difficult task because handwritten digits have similar patterns and alignment. Although modern research on optical character recognition has reduced the complexity of the classification task through several methods, a few problems are still encountered during recognition and remain to be solved with simpler methods. Deep neural networks, an emerging field of artificial intelligence, promise a solid solution to these handwritten recognition problems. This paper proposes a fine regulated deep neural network (FRDNN) for the handwritten numeric character recognition problem; it uses convolutional neural network (CNN) models with regularization parameters, which make the model generalize better by preventing overfitting. A traditional deep neural network (TDNN) and the FRDNN, with similar layers, were applied to the BanglaLekha-Isolated database, and the classification accuracies of the two models were 96.25% and 96.99%, respectively, over 100 epochs. In these experiments, the FRDNN model was more robust and accurate than the TDNN model on the BanglaLekha-Isolated digit dataset. The proposed method obtains good recognition accuracy compared with other existing methods.


In this paper we introduce a new Pashtu numerals dataset of scanned handwritten images. We make the dataset publicly available for scientific and research use. The Pashtu language is used by more than fifty million people for both oral and written communication, yet no effort has so far been devoted to an Optical Character Recognition (OCR) system for the Pashtu language. We introduce a new method for recognizing handwritten Pashtu numerals with deep learning based models. We use convolutional neural networks (CNNs) for both feature extraction and classification. We assess the performance of the proposed CNN-based model and obtain a recognition accuracy of 91.45%.


Author(s): Ahmed Hussain Aliwy, Basheer Al-Sadawi

Optical character recognition (OCR) refers to the process of converting text document images into editable and searchable text. The OCR process poses several challenges, in particular for the Arabic language, where it produces a high percentage of errors. In this paper, a method for improving the output of Arabic optical character recognition (AOCR) systems is suggested, based on a statistical language model built from the large corpora available. The method detects and corrects non-word and real-word errors according to the context of the word in the sentence. The results show an improvement, with the accuracy of the AOCR output reaching up to 98%.


Author(s): Dr. T. Kameswara Rao, K. Yashwanth Chowdary, I. Koushik Chowdary, K. Prasanna Kumar, Ch. Ramesh

In recent years, text extraction from document images has been one of the most widely studied topics in image analysis and optical character recognition. The extracted text can be used for document analysis, content analysis, document retrieval, and more. Many complex processes, such as Maximization Likelihood (ML), edge point detection, and corner point detection, are used to extract text from images. In this article, the corner point approach was used. To extract text from images, we used a very simple approach based on the FAST algorithm. First, we divided the image into blocks and checked the density of corner points in each block. The denser blocks were labeled as text blocks, and the less dense ones as image regions or noise. Then we checked the connectivity of the blocks in order to group them, so that the text part could be isolated from the image. This method is fast and versatile: it can be used to detect various languages, handwriting, and even text in images with a lot of noise and blur. Even though it is a very simple program, the precision of this method is close to or higher than 90%. In conclusion, this method enables more accurate and less complex detection of text in document images.
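The block-density and grouping steps described above can be sketched as follows. The sketch assumes the corner points have already been detected (for instance by a FAST detector, which is not implemented here) and are given as `(x, y)` pixel coordinates. The function names, the block size, the `min_points` threshold, and the use of 8-connectivity between blocks are hypothetical choices, not details from the article.

```python
def block_density(points, width, height, block):
    """Count corner points falling into each block x block cell."""
    cols = (width + block - 1) // block
    rows = (height + block - 1) // block
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            grid[y // block][x // block] += 1
    return grid


def text_blocks(grid, min_points):
    """Mark blocks with at least min_points corners as text (1), else 0."""
    return [[1 if c >= min_points else 0 for c in row] for row in grid]


def group_blocks(mask):
    """Group 8-connected text blocks; returns a list of sets of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    groups = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                stack, group = [(r, c)], set()
                seen.add((r, c))
                while stack:
                    cr, cc = stack.pop()
                    group.add((cr, cc))
                    for nr in range(cr - 1, cr + 2):
                        for nc in range(cc - 1, cc + 2):
                            if (0 <= nr < rows and 0 <= nc < cols
                                    and mask[nr][nc] and (nr, nc) not in seen):
                                seen.add((nr, nc))
                                stack.append((nr, nc))
                groups.append(group)
    return groups
```

Each returned group covers one candidate text region; isolated sparse blocks, such as a single stray corner from noise, fall below `min_points` and are dropped before grouping.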

