IMPROVEMENT OF THE COLOR TEXT IMAGE BINARIZATION METHOD USING THE MINIMUM-DISTANCE CLASSIFIER

2021 ◽  
Vol 4 (1) ◽  
pp. 57-70
Author(s):  
Marina V. Polyakova ◽  
Alexandr G. Nesteryuk

Optical character recognition systems are used to convert books and documents into electronic form, to automate accounting systems in business, to recognize markers in augmented reality technologies, etc. When binarization is applied, the quality of optical character recognition is largely determined by how well the foreground pixels are separated from the background. Existing methods of text image binarization are analyzed and their insufficient quality is noted. To improve an existing method of binarization of color text images, the minimum-distance classifier is used as the research tool. To improve the quality of binarization of color text images, it is advisable to divide image pixels into the two classes "Foreground" and "Background" using classification methods instead of heuristic threshold selection, namely a minimum-distance classifier. To reduce the amount of information processed before the classifier is applied, it is advisable to select blocks of pixels for subsequent processing; this is done by analyzing the connected components of the original image. An improved method of color text image binarization using connected-component analysis and a minimum-distance classifier has been elaborated. Research on the elaborated method showed that it outperforms existing binarization methods in terms of robustness, but is worse in terms of the error in determining the boundaries of objects. Among the recognition errors, pixels from the class labeled "Foreground" were more often mistaken for the class labeled "Background". With a single prototype per class, the proposed binarization method is recommended for processing color images of printed text, for which the error in determining character boundaries introduced by binarization is compensated by the thickness of the letters.
With multiple prototypes per class, the proposed binarization method is recommended for processing color images of handwritten text, if high performance is not required. The improved binarization method has shown its efficiency in cases of slow changes in the color and illumination of the text and background; however, abrupt changes in color and illumination, as well as a textured background, do not allow the binarization quality required for practical problems.
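The abstract does not spell out the classifier's exact form; as a minimal sketch, assuming Euclidean distance in RGB space and one fixed prototype per class (the prototype values and function name below are illustrative, not the authors' implementation), the pixel classification step could look like:

```python
import math

def binarize_min_distance(pixels, fg_proto, bg_proto):
    """Assign each RGB pixel to 'Foreground' (1) or 'Background' (0)
    by Euclidean distance to the nearer class prototype."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [[1 if dist(p, fg_proto) <= dist(p, bg_proto) else 0
             for p in row] for row in pixels]

# A 2x2 toy image: dark text pixels vs. light background.
image = [[(20, 20, 20), (240, 235, 230)],
         [(30, 25, 25), (250, 250, 245)]]
mask = binarize_min_distance(image, fg_proto=(0, 0, 0), bg_proto=(255, 255, 255))
# mask -> [[1, 0], [1, 0]]
```

In the paper's setting the prototypes would be estimated from the selected pixel blocks rather than fixed at pure black and white as here.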

Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2914
Author(s):  
Hubert Michalak ◽  
Krzysztof Okarma

Image binarization is one of the key operations for decreasing the amount of information used in further analysis of image data, and it significantly influences the final results. Although in some applications, where well-illuminated, high-contrast images may be easily captured, even simple global thresholding may be sufficient, there are more challenging cases, e.g., those based on the analysis of natural images or assuming the presence of quality degradations, such as in historical document images. Considering the variety of image binarization methods, as well as their different applications and types of images, one cannot expect a single universal thresholding method that would be the best solution for all images. Nevertheless, since one of the most common operations preceded by binarization is Optical Character Recognition (OCR), which may also be applied to non-uniformly illuminated images captured by camera sensors mounted in mobile phones, the development of better binarization methods aimed at maximizing OCR accuracy is still expected. Therefore, in this paper, the idea of using robust combined measures is presented, making it possible to bring together the advantages of various methods, including some recently proposed approaches based on entropy filtering and a multi-layered stack of regions. The experimental results, obtained for a dataset of 176 non-uniformly illuminated document images, referred to as the WEZUT OCR Dataset, confirm the validity and usefulness of the proposed approach, leading to a significant increase in recognition accuracy.
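The combined measures themselves are not specified in the abstract; as a minimal illustration of the general idea of bringing several binarization results together, a majority vote over global thresholds can serve as a stand-in (the thresholds and function names here are illustrative, not the paper's method):

```python
def global_threshold(gray, t):
    """Simple global thresholding: pixel is foreground (1) if darker than t."""
    return [[1 if v < t else 0 for v in row] for row in gray]

def combine_by_vote(masks):
    """Majority vote across several binarization results: a pixel is
    foreground when more than half of the input masks mark it so."""
    h, w = len(masks[0]), len(masks[0][0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            votes = sum(m[i][j] for m in masks)
            row.append(1 if votes * 2 > len(masks) else 0)
        out.append(row)
    return out

gray = [[10, 200], [90, 180]]
masks = [global_threshold(gray, t) for t in (64, 128, 192)]
combined = combine_by_vote(masks)
# combined -> [[1, 0], [1, 0]]
```

The paper's actual combination is richer (entropy filtering, a multi-layered stack of regions), but the voting step shows how disagreement between individual methods can be resolved.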


2020 ◽  
Vol 8 (5) ◽  
pp. 5665-5674

Optical Character Recognition has emerged as an attractive research field nowadays. A lot of work has been done on Urdu script using various approaches, and diverse methodologies based on the Nastaliq font style have been put forward. Urdu is written diagonally from top to bottom, a style known as Nastaliq. This feature makes Urdu highly cursive and more sensitive, leading to a difficult recognition problem. Due to the peculiarities of the Nastaliq style of writing, we have chosen the ligature as the basic unit of recognition in order to reduce the complexity of the system. The accuracy of recognizing ligatures in Urdu text corresponds to the efficiency with which the ligatures are segmented. In addition to extracting connected components, the ligature segmentation takes into consideration various factors such as baseline information, height, width, and centroid. In this paper, ligature recognition is performed using a multi-SVM (Support Vector Machine) approach, which gives an accuracy of 97% when 903 text images are fed to it.
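The segmentation step above rests on connected-component extraction plus shape features (height, width, centroid); a minimal sketch of that step, assuming 4-connectivity on a binary mask (the function name and feature set are illustrative, not the paper's code):

```python
from collections import deque

def connected_components(mask):
    """4-connected component labelling on a binary mask; returns the
    per-component features (bounding-box height/width and centroid)
    used when grouping pixels into ligatures."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                q, cells = deque([(r, c)]), []
                seen[r][c] = True
                while q:  # breadth-first flood fill of one component
                    i, j = q.popleft()
                    cells.append((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < h and 0 <= nj < w and mask[ni][nj] and not seen[ni][nj]:
                            seen[ni][nj] = True
                            q.append((ni, nj))
                rows = [i for i, _ in cells]
                cols = [j for _, j in cells]
                comps.append({
                    "height": max(rows) - min(rows) + 1,
                    "width": max(cols) - min(cols) + 1,
                    "centroid": (sum(rows) / len(cells), sum(cols) / len(cells)),
                })
    return comps

mask = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1]]
feats = connected_components(mask)
# two components: an L-shape in a 2x2 box and a 2x1 vertical bar
```

In the full system these features would be combined with baseline information to decide which components belong to the same ligature.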


1979 ◽  
Vol 73 (10) ◽  
pp. 389-399
Author(s):  
Gregory L. Goodrich ◽  
Richard R. Bennett ◽  
William R. De L'aune ◽  
Harvey Lauer ◽  
Leonard Mowinski

This study was designed to assess the Kurzweil Reading Machine's ability to read three different type styles produced by five different means. The results indicate that the Kurzweil Reading Machines tested have different error rates depending upon the means of producing the copy and upon the type style used; there was a significant interaction between copy method and type style. The interaction indicates that some type styles are better read when the copy is made by one means rather than another. Error rates varied between less than one percent and more than twenty percent. In general, the user will find that high quality printed materials will be read with a relatively high level of accuracy, but as the quality of the material decreases, the number of errors made by the machine also increases. As this error rate increases, the user will find it increasingly difficult to understand the spoken output.


Author(s):  
Janarthanan A ◽  
Pandiyarajan C ◽  
Sabarinathan M ◽  
Sudhan M ◽  
Kala R

Optical character recognition (OCR) is the process of recognizing text in images, one word at a time. The input images are taken from a dataset. The collected text images first undergo pre-processing, in which they are resized. Image resizing is necessary when the total number of pixels must be increased or decreased; remapping occurs when zooming, which increases the number of pixels so that a zoomed image still shows clear content. After that, segmentation is applied, in which each character in a word is segmented. Feature values, i.e., the test features, are then extracted from the image. In the classification stage, the text in the image is classified: image classification is performed in order to identify which images contain text, and a classifier is used for this purpose. The experimental results demonstrate the accuracy of the approach.
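The resize-by-remapping step described above can be sketched with nearest-neighbour sampling, the simplest remapping scheme (the function name is illustrative; the abstract does not say which interpolation the authors used):

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour remapping: each output pixel samples the closest
    source pixel, so enlarging an image keeps its content legible."""
    h, w = len(img), len(img[0])
    return [[img[min(h - 1, i * h // new_h)][min(w - 1, j * w // new_w)]
             for j in range(new_w)] for i in range(new_h)]

# Zooming a 2x2 grayscale image to 4x4 replicates each source pixel.
img = [[0, 255],
       [255, 0]]
big = resize_nearest(img, 4, 4)
# big -> [[0, 0, 255, 255], [0, 0, 255, 255],
#         [255, 255, 0, 0], [255, 255, 0, 0]]
```

Production OCR pipelines typically prefer bilinear or bicubic interpolation for smoother results, but nearest-neighbour shows the remapping idea most directly.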


Author(s):  
Michael Plotnikov ◽  
Paul W. Shuldiner

The ability of an automated license plate reading (ALPR) system to convert video images of license plates into computer records depends on many factors. Of these, two are readily controlled by the operator: the quality of the video images captured in the field and the internal settings of the ALPR used to transcribe these images. A third factor, the light conditions under which the license plate images are acquired, is less easily managed, especially when camcorders are used in the field under ambient light conditions. A set of experiments was conducted to test the effects of ambient light conditions, video camcorder adjustments, and internal ALPR settings on the percentage of correct reads attained by a specific type of ALPR, one whose optical character recognition process is based on template matching. Images of rear license plates were collected under four ambient light conditions: overcast with no shadows, and full sunlight with the sun in front of the camcorder, behind the camcorder, and orthogonal to the line of sight. Three camcorder exposure settings were tested: two made use of the camcorder's internal light meter, and the third relied solely on operator judgment. The percentage of license plates read correctly ranged from 41% to 72%, depending most strongly on ambient light conditions. In all cases, careful adjustment of the ALPR led to significantly improved read rates over those obtained using the manufacturer's recommended default settings. Exposure settings based on the operator's judgment worked best in all instances.
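Template matching, the recognition principle named above, scores a candidate glyph against stored character templates and picks the best match; a minimal sketch on binary glyphs, using pixel-agreement as the score (the templates and function names are illustrative, not the ALPR vendor's algorithm):

```python
def match_score(glyph, template):
    """Fraction of pixels on which a binary glyph and a template agree."""
    total = sum(len(row) for row in glyph)
    hits = sum(g == t for gr, tr in zip(glyph, template)
                        for g, t in zip(gr, tr))
    return hits / total

def recognize(glyph, templates):
    """Template matching: return the label whose template agrees best."""
    return max(templates, key=lambda label: match_score(glyph, templates[label]))

templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_L = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # one foreground pixel missing
recognize(noisy_L, templates)  # -> "L"
```

Real ALPR systems use grayscale correlation over many fonts and sizes, but the same best-match principle applies, which is why image quality and exposure matter so much for read rates.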


Author(s):  
Sameer M. Patel ◽  
Sarvesh S. Pai ◽  
Mittal B. Jain ◽  
Vaibhav P. Vasani

Optical Character Recognition is basically the mechanical or electronic conversion of printed or handwritten text into machine-understandable text. The problem of Optical Character Recognition under different conditions remains as relevant as it was in the past few years. At the present time of automation and innovation, keyboarding remains the most common way of feeding data into computers, and it is probably the most time-consuming and labor-intensive operation in the industry. Automating the recognition of documents, credit cards, electronic invoices, and license plates of cars could help save time spent analyzing and processing data. With increased research and development in machine learning, the quality of text recognition is continuously improving. Our paper provides a brief explanation of the different stages involved in the process of optical character recognition, and through the proposed application we aim to automate the extraction of important text from electronic invoices. The main goal of the project is to develop a real-time OCR web application with a microservice architecture, which would help in extracting the necessary information from an invoice.


2019 ◽  
Vol 16 (2) ◽  
pp. 0409
Author(s):  
Ali et al.

Human Interactive Proofs (HIPs) are automatic inverse Turing tests intended to differentiate between people and malicious computer programs. Building a good HIP system is a challenging task, since the resulting HIP must be secure against attacks and at the same time practical for humans. Text-based HIPs are one of the most popular HIP types. They exploit the superior ability of humans over Optical Character Recognition (OCR) to read text images, but current text-based HIPs have not kept pace with the rapid development of computer vision techniques: they are either very easily passed or very hard to solve, which motivates continuous efforts to improve text-based HIPs. In this paper, a new scheme is designed for an animated text-based HIP; this scheme exploits the gap between ordinary human perception and the computer's ability to mimic it in order to achieve a more secure and more usable HIP. The scheme can prevent attacks, since it is hard for a machine to distinguish characters in an animated environment displayed as digital video, yet it remains easy and practical for humans, who are attuned to perceiving motion. The proposed scheme has been tested against many Optical Character Recognition applications, passes all these tests successfully, and achieves a high usability rate of 95%.


2021 ◽  
Vol 11 (6) ◽  
pp. 7968-7973
Author(s):  
M. Kazmi ◽  
F. Yasir ◽  
S. Habib ◽  
M. S. Hayat ◽  
S. A. Qazi

Urdu Optical Character Recognition (OCR) based on character-level recognition (the analytical approach) is less popular than ligature-level recognition (the holistic approach) due to its added complexity and overlapping characters and strokes. This paper presents a holistic-approach Urdu ligature extraction technique. The proposed Photometric Ligature Extraction (PLE) technique is independent of font size and column layout and is capable of handling non-overlapping as well as all inter- and intra-overlapping ligatures. It uses a customized photometric filter along with X-shearing, padding, and connected component analysis to extract complete ligatures instead of extracting primary and secondary ligatures separately. A total of approximately 267,800 ligatures were extracted from scanned Urdu Nastaliq printed text images with an accuracy of 99.4%. Thus, the proposed framework outperforms the existing Urdu Nastaliq text extraction and segmentation algorithms. The proposed PLE framework can also be applied to other languages that use the Nastaliq script style, such as Arabic, Persian, Pashto, and Sindhi.
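The abstract names X-shearing and padding but gives no formulas; as a minimal sketch, an X-shear shifts each row horizontally in proportion to its row index, which can straighten the diagonal Nastaliq slant before connected-component analysis (the shear factor and function name here are illustrative):

```python
def x_shear(mask, k):
    """X-shear a binary image: row r is shifted horizontally by round(k * r).
    The output is padded so shifted pixels stay inside the image."""
    h, w = len(mask), len(mask[0])
    pad = abs(round(k * (h - 1)))
    out = [[0] * (w + pad) for _ in range(h)]
    for r in range(h):
        # Negative shears are offset by the padding so indices stay >= 0.
        shift = round(k * r) if k >= 0 else round(k * r) + pad
        for c in range(w):
            if mask[r][c]:
                out[r][c + shift] = 1
    return out

# A diagonal stroke becomes a vertical bar under a shear of -1 column/row.
diag = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]]
sheared = x_shear(diag, -1)
# sheared -> [[0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
```

After such a shear, overlapping diagonal ligatures separate into distinct vertical components, which is what makes the subsequent connected-component extraction reliable.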

