Neural-Based Hit-Count Feature Extraction Method for Telugu Script Optical Character Recognition

Author(s):  
M. Swamy Das ◽  
Kovvur Ram Mohan Rao ◽  
P. Balaji
Author(s):  
Htwe Pa Pa Win ◽  
Phyo Thu Thu Khine ◽  
Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of printed Myanmar documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the choice of feature extraction method. Existing OCR systems use different feature extraction methods because the scripts they target differ in nature. One major contribution of this work is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, the paper assumes that the documents have already been segmented into characters and extracts features from these isolated Myanmar characters. The features are extracted using structural analysis of the Myanmar script. The experiments were carried out using a Support Vector Machine (SVM) classifier, and the results are compared with those of the previously proposed feature extraction method.


Author(s):  
FANG-HSUAN CHENG ◽  
WEN-HSING HSU

This paper describes typical research on Chinese optical character recognition in Taiwan. Chinese characters can be represented by a set of basic line segments called strokes. Several approaches to the recognition of handwritten Chinese characters by stroke analysis are described here. A typical optical character recognition (OCR) system consists of four main parts: image preprocessing, feature extraction, radical extraction and matching. Image preprocessing is used to provide the suitable format for data processing. Feature extraction is used to extract stable features from the Chinese character. Radical extraction is used to decompose the Chinese character into radicals. Finally, matching is used to recognize the Chinese character. The reasons for using strokes as the features for Chinese character recognition are the following. First, all Chinese characters can be represented by a combination of strokes. Second, the algorithms developed under the concept of strokes do not have to be modified when the number of characters increases. Therefore, the algorithms described in this paper are suitable for recognizing large sets of Chinese characters.
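The second argument for stroke features, that the algorithms need not change as the character set grows, can be illustrated with a toy matching stage. The sketch below is not the authors' method: the stroke names (`heng`, `shu`), the `STROKE_TABLE` dictionary and the `match` function are hypothetical, and it assumes an earlier stage has already reduced each character to an ordered sequence of stroke types.

```python
# Toy matching stage: characters are looked up by their stroke sequence.
# Stroke names and the table layout are illustrative assumptions only.
STROKE_TABLE = {
    ("heng", "heng"): "二",          # two horizontal strokes
    ("heng", "heng", "heng"): "三",  # three horizontal strokes
    ("heng", "shu"): "十",           # horizontal, then vertical
}

def match(strokes):
    """Return the character for a stroke sequence, or None if unknown."""
    return STROKE_TABLE.get(tuple(strokes))

# Growing the character set only adds data; match() is left unchanged.
STROKE_TABLE[("heng", "heng", "shu", "heng")] = "王"
```

A sequence of stroke types alone does not separate every pair of characters, which is one reason a real system would also use stroke positions and a radical extraction step, as the abstract describes.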


2017 ◽  
Vol 5 (1) ◽  
pp. 154-169 ◽  
Author(s):  
Galih Hendra Wibowo ◽  
Riyanto Sigit ◽  
Aliridho Barakbah

Javanese script is part of Indonesia's cultural heritage, especially in Java. However, the number of Javanese people who are able to read the script has decreased, so conservation efforts are needed in the form of a system that is able to recognize the characters. One solution to this problem lies in Optical Character Recognition (OCR), where one of the most important steps is feature extraction, which serves to distinguish each character. Shape Energy is a feature extraction method based on the idea that a character can be distinguished simply through its skeleton. Building on this idea, the feature extraction is extended, based on the components of Shape Energy, to produce an angular histogram over various multiples of an angle. The performance of this method and of the original method is then tested on a Javanese character dataset of 240 samples with 19 labels, collected from various images, using K-Nearest Neighbors as the classification method. Cross-validation gives an accuracy of 80.83% for the angular histogram with an angle of 20 degrees, 23% better than Shape Energy. In addition, other test results show that the method is able to recognize rotated characters, with the lowest performance value of 86% at 180-degree rotation and the highest performance value of 96.97% at 90-degree rotation. It can be concluded that this method improves on Shape Energy for the recognition of Javanese characters and is robust to rotation.
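The general idea of an angular histogram feature combined with a nearest-neighbour classifier can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes the skeleton is already available as an ordered list of (x, y) points, bins segment directions in 20-degree steps over 360 degrees, and uses a single nearest neighbour. The names `angular_histogram` and `nearest_label` are hypothetical.

```python
import math

def angular_histogram(points, bin_deg=20):
    """Normalized histogram of segment directions along a skeleton polyline.

    Assumes bin_deg divides 360 evenly (e.g. 20 gives 18 bins).
    """
    n_bins = 360 // bin_deg
    hist = [0.0] * n_bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # skip repeated points, which have no direction
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        hist[int(angle // bin_deg) % n_bins] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def nearest_label(query, train):
    """1-nearest-neighbour over (feature_vector, label) pairs."""
    return min(train, key=lambda item: math.dist(query, item[0]))[1]

# Usage: two toy training shapes and one query.
train = [
    (angular_histogram([(0, 0), (1, 0), (2, 0)]), "horizontal"),
    (angular_histogram([(0, 0), (0, 1), (0, 2)]), "vertical"),
]
label = nearest_label(angular_histogram([(0, 0), (3, 0)]), train)
```

Because the histogram is normalized by the number of segments, it does not depend on how many points the skeleton has. It is not rotation invariant as written; the rotation robustness reported in the abstract would need additional handling that this sketch does not attempt.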

