A Survey on Arabic Handwritten Script Recognition Systems

Optical character recognition (OCR) remains an active research field in pattern recognition. OCR systems identify, recognize and distinguish characters and texts electronically, whether printed or handwritten. They also transform such data into machine-processable form to facilitate interaction between user and machine in various applications. In this paper, we present the global structure of an OCR system, with its types (on-line and off-line), categories (printed and handwritten) and main steps. We also focus on off-line handwritten Arabic character recognition and provide a list of the main publicly available datasets. The paper also surveys the works carried out over recent years. Finally, some open issues and potential research directions are highlighted.

Identification and correction of rejection and substitution errors in optical character recognition systems

10.1117/12.143616 ◽

1993 ◽

Author(s):

Glenn S. Himes ◽

Marty M. Scholl ◽

Frank A. DeCosta III

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Optical Character ◽

Recognition Systems

Word and Character Segmentation in Devnagari and Odia Script – A Comparative Analysis

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i7060.079920 ◽

2020 ◽

Vol 9 (9) ◽

pp. 377-382

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Research Area ◽

Character Segmentation ◽

National Language ◽

Eastern India ◽

Regional Language ◽

Optical Character ◽

Active Research ◽

Active Research Area

Optical Character Recognition has been an active research area in computer science for several years, and several research works have been undertaken on various languages in India. In this paper, an attempt has been made to measure the accuracy of word and character segmentation for Hindi (the national language of India) and Odia, a regional language mostly spoken in Odisha and a few Eastern Indian states. A comparative analysis is presented in this article. Ten sets each of printed Odia and Devanagari script, with different word limits, were used in this study. The documents were scanned at 300 dpi before applying the pre-processing and segmentation procedures. The results show that accuracy in both word and character segmentation is higher for Odia than for Hindi. One of the reasons is the use of the header line in Hindi, which makes the segmentation process cumbersome. Thus, it can be concluded that accuracy can vary from one language to another and between word segmentation and character segmentation.
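The abstract does not say which segmentation algorithm was used. As a minimal sketch, assuming a projection-profile approach (a common choice for printed text, not confirmed as the authors' method), the effect of the header line on character segmentation can be illustrated as follows. The function names, the toy image and the rule "the header line is the row with the most ink" are all illustrative assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def ink_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where the profile is non-zero."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            runs.append((start, i))        # a gap closes the current run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def column_profile(img: Image) -> List[int]:
    """Vertical projection: number of ink pixels in each column."""
    return [sum(col) for col in zip(*img)]


def split_characters(word: Image, strip_header: bool = False) -> List[Tuple[int, int]]:
    """Column ranges of the characters found in a single word image."""
    rows = word
    if strip_header:
        # Assumption: the header line is the single row with the most ink.
        header = max(range(len(word)), key=lambda r: sum(word[r]))
        rows = [row for r, row in enumerate(word) if r != header]
    return ink_runs(column_profile(rows))


# Toy word image: two characters joined by a header line in the top row.
word = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0],
]
joined = split_characters(word)                     # header bridges every gap
separated = split_characters(word, strip_header=True)
```

With the header row in place, every column contains ink, so the vertical projection finds no gap and the word comes back as one segment. Once that row is removed, the two characters separate. Scripts without a header line, such as Odia, do not need this extra step.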

Optical Character Recognition Systems for German Language

Optical Character Recognition Systems for Different Languages with Soft Computing - Studies in Fuzziness and Soft Computing ◽

10.1007/978-3-319-50252-6_6 ◽

2016 ◽

pp. 137-164

Author(s):

Arindam Chaudhuri ◽

Krupa Mandaviya ◽

Pratixa Badelia ◽

Soumya K. Ghosh

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

German Language ◽

Optical Character ◽

Recognition Systems

Optical Character Recognition Systems

Optical Character Recognition Systems for Different Languages with Soft Computing - Studies in Fuzziness and Soft Computing ◽

10.1007/978-3-319-50252-6_2 ◽

2016 ◽

pp. 9-41 ◽

Cited By ~ 12

Author(s):

Arindam Chaudhuri ◽

Krupa Mandaviya ◽

Pratixa Badelia ◽

Soumya K. Ghosh

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Optical Character ◽

Recognition Systems

WAVELET DESCRIPTORS FOR RECOGNITION OF BASIC SYMBOLS IN PRINTED KANNADA TEXT

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691307001793 ◽

2007 ◽

Vol 05 (02) ◽

pp. 351-367 ◽

Cited By ~ 12

Author(s):

R. SANJEEV KUNTE ◽

R. D. SUDHAKER SAMUEL

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Indian Languages ◽

Indian Language ◽

South Indian ◽

Wavelet Features ◽

On Line ◽

Recognition Systems ◽

System Methodology

Optical Character Recognition (OCR) systems have been effectively developed for the recognition of printed characters of non-Indian languages. Efforts are underway to develop efficient OCR systems for Indian languages, especially for Kannada, a popular South Indian language. In this paper, we present an OCR system developed for the recognition of basic characters in printed Kannada text, which can handle different font sizes and font sets. Wavelets, which have been increasingly used in pattern recognition and on-line character recognition systems, are used in our system to extract the features of printed Kannada characters. Neural classifiers are used to classify the characters based on these wavelet features. The system methodology can be extended to the recognition of other South Indian languages, especially Telugu.
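The abstract does not name the wavelet family or the feature layout. As a rough sketch, assuming an unnormalised Haar wavelet and keeping only the low-frequency band as the feature vector (both are assumptions, not details from the paper), wavelet feature extraction for a character image might look like this. The names `haar_step` and `haar_features` are hypothetical.

```python
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the unnormalised Haar transform on an even-length sequence.

    Returns (approximation, detail): pairwise averages and pairwise half-differences.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(img: Sequence[Sequence[float]], levels: int = 1) -> List[float]:
    """Flattened low-frequency band after `levels` two-dimensional Haar steps.

    Both image dimensions must be divisible by 2 ** levels.
    """
    band = [[float(v) for v in row] for row in img]
    for _ in range(levels):
        rows = [haar_step(row)[0] for row in band]              # average along rows
        cols = [haar_step(list(col))[0] for col in zip(*rows)]  # then along columns
        band = [list(row) for row in zip(*cols)]                # transpose back
    return [v for row in band for v in row]


# Toy 4x4 binary "character": ink in the top-left and bottom-right quadrants.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
features = haar_features(glyph, levels=1)
```

Each level halves both image dimensions, so the feature vector shrinks by a factor of four while keeping the coarse shape of the glyph. A vector of this kind would then be passed to a classifier, which is not shown here.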

Performance Analysis of Open Source Optical Character Recognition

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9060 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4267-4275

Author(s):

Jagadish Kallimani ◽

Chandrika Prasad ◽

D. Keerthana ◽

Manoj J. Shet ◽

Prasada Hegde ◽

...

Keyword(s):

Performance Analysis ◽

Comparative Study ◽

Open Source ◽

Error Rate ◽

Character Recognition ◽

Optical Character Recognition ◽

Accurate Result ◽

Recognition System ◽

Optical Character ◽

Recognition Systems

Optical character recognition is the process of converting images of text into machine-encoded text, electronically or mechanically. The text in the image can be handwritten, typed or printed. Image sources include a picture of a document, a scanned document or text superimposed on an image. Most optical character recognition systems do not give a 100% accurate result. This project analyzes the error rate of a few open source optical character recognition systems (Boxoft OCR, ABBYY, Tesseract, Free Online OCR etc.) on a set of diverse documents and makes a comparative study of them. From this, we can determine which OCR system is best suited to a given document.
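The abstract does not define how the error rate is measured. One widely used measure is the character error rate (CER): the Levenshtein edit distance between the OCR output and the reference text, divided by the length of the reference. The sketch below assumes that definition. The function names and the sample strings are illustrative only and are not taken from the study.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical OCR output with one substitution (letter 'o' read as digit '0').
reference = "recognition"
ocr_output = "recogniti0n"
cer = character_error_rate(reference, ocr_output)
```

Running each OCR system on the same documents and comparing the resulting CER values against a hand-checked reference text gives a simple basis for the kind of comparison described above. Note that CER can exceed 1.0 when the output is much longer than the reference.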
