Javanese Character Feature Extraction Based on Shape Energy

The Javanese script is part of Indonesia's cultural heritage, especially in Java. However, the number of Javanese people who can read the script has decreased, so conservation efforts are needed in the form of a system that can recognize the characters. One solution to this problem lies in Optical Character Recognition (OCR), where one of the most critical steps is feature extraction, which distinguishes each character from the others. Shape Energy is a feature extraction method built on the idea that a character can be distinguished simply through its skeleton. Building on this idea, the method is extended, based on its components, to produce an angular histogram with bins at various angle multiples. The performance of this method and of the original Shape Energy method is tested on a Javanese character dataset collected from various images, comprising 240 samples with 19 labels, using K-Nearest Neighbors as the classifier. Accuracy obtained through cross-validation reached 80.83% for the angular histogram with an angle of 20 degrees, 23% better than Shape Energy. Other tests show that the method can recognize rotated characters, with the lowest accuracy of 86% at 180-degree rotation and the highest of 96.97% at 90-degree rotation. It can be concluded that this method improves on the performance of Shape Energy in recognizing Javanese characters and is robust to rotation.
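
The abstract does not say how the angles of the angular histogram are measured, so the following is only a rough sketch under stated assumptions: angles are taken around the centroid of the skeleton pixels, and the 20-degree setting is treated as the bin width. The function name `angular_histogram` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def angular_histogram(skeleton_pixels, bin_width_deg=20):
    """Normalized count of skeleton pixels per angular sector.

    ASSUMPTION: angles are measured around the centroid of the skeleton;
    the paper's actual construction may differ.
    skeleton_pixels: list of (x, y) coordinates of skeleton pixels.
    """
    if 360 % bin_width_deg != 0:
        raise ValueError("bin_width_deg must divide 360")
    n_bins = 360 // bin_width_deg
    hist = [0.0] * n_bins
    if not skeleton_pixels:
        return hist

    # Centroid of the skeleton, used as the origin for all angles.
    cx = sum(x for x, _ in skeleton_pixels) / len(skeleton_pixels)
    cy = sum(y for _, y in skeleton_pixels) / len(skeleton_pixels)

    counted = 0
    for x, y in skeleton_pixels:
        if x == cx and y == cy:
            continue  # the angle is undefined at the centroid itself
        angle = math.degrees(math.atan2(y - cy, x - cx)) % 360.0
        # The extra modulo guards against angle rounding up to 360.0.
        hist[int(angle // bin_width_deg) % n_bins] += 1
        counted += 1

    # Normalize so the histogram does not depend on stroke length.
    return [h / counted for h in hist] if counted else hist
```

With a 20-degree bin width the feature vector has 18 entries; a smaller width gives a finer but longer descriptor.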

A Structural Analysis Based Feature Extraction Method for OCR System For Myanmar Printed Document Images

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2012010102 ◽

2012 ◽

Vol 2 (1) ◽

pp. 16-41 ◽

Cited By ~ 1

Author(s):

Htwe Pa Pa Win ◽

Phyo Thu Thu Khine ◽

Khin Nwe Ni Tun

Keyword(s):

Feature Extraction ◽

Structural Analysis ◽

Character Recognition ◽

Optical Character Recognition ◽

Extraction Method ◽

Recognition Performance ◽

Extraction Methods ◽

Support Vector ◽

Svm Classifier ◽

Feature Extraction Method

This paper proposes a new feature extraction method for off-line recognition of Myanmar printed documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the choice of feature extraction method. Existing OCR systems use various feature extraction methods because of the diversity of the scripts' natures. One major contribution of this work is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, the paper assumes the documents have been successfully segmented into characters and extracts features from these isolated Myanmar characters. The features are extracted using structural analysis of the Myanmar script. Experiments were carried out using a Support Vector Machine (SVM) classifier, and the results are compared with the previously proposed feature extraction method.

Neural-Based Hit-Count Feature Extraction Method for Telugu Script Optical Character Recognition

Lecture Notes in Networks and Systems - Innovations in Electronics and Communication Engineering ◽

10.1007/978-981-10-8204-7_48 ◽

2018 ◽

pp. 479-486 ◽

Cited By ~ 1

Author(s):

M. Swamy Das ◽

Kovvur Ram Mohan Rao ◽

P. Balaji

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Optical Character Recognition ◽

Extraction Method ◽

Feature Extraction Method ◽

Optical Character

A New Combined Feature Extraction Method for Persian Handwritten Digit Recognition

International Journal of Image and Graphics ◽

10.1142/s0219467817500127 ◽

2017 ◽

Vol 17 (02) ◽

pp. 1750012 ◽

Cited By ~ 2

Author(s):

Mohammad Javad Parseh ◽

Mojtaba Meftahi

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Optical Character Recognition ◽

Feature Vector ◽

Principal Component ◽

Support Vector ◽

Training Time ◽

Feature Extraction Method ◽

Pca Method ◽

Suitable Combination

Feature extraction is one of the most important steps in Optical Character Recognition (OCR) systems and strongly affects recognition accuracy. In this paper, a suitable combination of different features, such as zoning, hole size, and crossing counts, is proposed for Persian handwritten digit recognition. Because of the large number of features, the feature vector has a high dimension, which increases training time exponentially. To solve this problem, Principal Component Analysis (PCA) is employed to reduce the feature vector dimension. Finally, the data are classified by a Support Vector Machine (SVM). The proposed method has been run on the HODA dataset, one of the largest standard datasets of Persian handwritten digits, which includes 60,000 training and 20,000 test samples. The proposed method reaches 99.07% accuracy on this dataset, and the experimental results show a significant improvement in the accuracy of Persian handwritten OCR compared to previous methods.
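
Two of the features named in the abstract, zoning and crossing counts, can be illustrated with a minimal sketch. It assumes a binarized image stored as a list of rows of 0/1 values, a square grid of zones, and row-wise crossings only; the function names are hypothetical, and the paper's exact definitions (and its PCA and SVM stages) are not reproduced here.

```python
def zoning_density(image, zones=2):
    """Fraction of foreground pixels in each cell of a zones x zones grid.

    ASSUMPTION: image height and width are divisible by `zones`.
    """
    height, width = len(image), len(image[0])
    zone_h, zone_w = height // zones, width // zones
    features = []
    for zr in range(zones):
        for zc in range(zones):
            total = sum(
                image[r][c]
                for r in range(zr * zone_h, (zr + 1) * zone_h)
                for c in range(zc * zone_w, (zc + 1) * zone_w)
            )
            features.append(total / (zone_h * zone_w))
    return features


def crossing_counts(image):
    """Number of background-to-foreground transitions along each row."""
    counts = []
    for row in image:
        previous, transitions = 0, 0
        for pixel in row:
            if pixel == 1 and previous == 0:
                transitions += 1
            previous = pixel
        counts.append(transitions)
    return counts
```

Concatenating such vectors grows the feature dimension quickly, which is the situation the abstract addresses with PCA before classification.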

Isolated Handwritten Pashto Character Recognition Using a K-NN Classification Tool based on Zoning and HOG Feature Extraction Techniques

Complexity ◽

10.1155/2021/5558373 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Juanjuan Huang ◽

Ihtisham Ul Haq ◽

Chaolan Dai ◽

Sulaiman Khan ◽

Shah Nazir ◽

...

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Optical Character Recognition ◽

Recognition System ◽

Histogram Of Oriented Gradients ◽

Extraction Techniques ◽

K Nearest Neighbors ◽

Handwritten Documents ◽

Handwritten Text Recognition ◽

Classification Tool

Handwritten text recognition is considered among the most challenging tasks for the research community because the shapes of different characters differ only slightly in handwritten documents. The unavailability of a standard dataset makes the problem even harder for researchers to work on. To address these problems, this paper presents an optical character recognition system for offline Pashto characters. The lack of a standard handwritten Pashto character database is addressed by developing a medium-sized database of offline Pashto characters. This database consists of 11352 character images (258 samples for each of the 44 characters in the Pashto script). Histogram of oriented gradients and zoning-based density features are used for feature extraction from the carved Pashto characters. K-nearest neighbors is used as the classification tool for the proposed algorithm, based on the proposed feature sets. An accuracy of 80.34% is obtained for the histogram of oriented gradients, while 76.42% is achieved for zoning-based density features, using 10-fold cross validation.
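
The 10-fold cross validation mentioned above can be sketched as an index split. This is a generic illustration, not the authors' code: the helper name `k_fold_indices` is hypothetical, folds are contiguous, and in practice the samples would be shuffled (or stratified by character) before splitting.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    ASSUMPTION: the caller has already shuffled the samples.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Dataset size reported in the abstract: 44 characters x 258 samples each.
n_images = 44 * 258
fold_sizes = [len(test) for _, test in k_fold_indices(n_images, k=10)]
```

Each sample lands in exactly one test fold, and the reported accuracy would be the average over the ten held-out folds.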

Handwritten Balinesse Character Recognition using K-Nearest Neighbor

10.31227/osf.io/z6m8u ◽

2018 ◽

Author(s):

I Wayan Agus Surya Darma

Keyword(s):

Feature Extraction ◽

Success Rate ◽

Character Recognition ◽

Nearest Neighbor ◽

Recognition System ◽

Extraction Process ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Character Feature

Balinese character recognition is a technique for recognizing the features or patterns of Balinese characters. The features of a Balinese character are generated through a feature extraction process. This research uses handwritten Balinese characters. Here, the feature extraction process generates semantic and direction features of the handwritten characters. Recognition uses the K-Nearest Neighbor algorithm to recognize 81 handwritten Balinese characters. The features of the test character images are compared with reference features. With K=3 and reference=10, the recognition system achieves a success rate of 97.53%.
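
The comparison of test features against reference features with K=3 can be illustrated by a minimal K-Nearest Neighbor sketch. Euclidean distance, the tie-breaking behavior, and the name `knn_predict` are assumptions for illustration; the abstract does not state which distance measure the system uses.

```python
import math
from collections import Counter


def knn_predict(query, references, k=3):
    """Return the majority label among the k references closest to `query`.

    references: list of (feature_vector, label) pairs.
    ASSUMPTION: Euclidean distance; the paper may use another measure.
    """
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, Counter keeps insertion order, so the label of the
    # closest neighbor among the tied labels wins.
    return votes.most_common(1)[0][0]
```

An odd K such as 3 avoids ties in two-class votes, although ties remain possible when three different labels appear among the neighbors.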

A novel GNSS deformation feature extraction method based on ensemble improved LMD threshold denoising

Journal of Applied Geodesy ◽

10.1515/jag-2020-0039 ◽

2020 ◽

Vol 14 (4) ◽

pp. 445-453

Author(s):

Qian Fan ◽

Yiqun Zhu

Keyword(s):

Feature Extraction ◽

Spline Interpolation ◽

Simulated Data ◽

Mining Area ◽

Deformation Monitoring ◽

Denoising Method ◽

Feature Extraction Method ◽

Deformation Feature ◽

Mode Decomposition ◽

Better Than

To address the difficulty of choosing a reasonable moving span in the basic local mean decomposition (LMD) method, an improved LMD method (ILMD) is proposed, which uses cubic spline interpolation in place of the sliding average. On this basis, with the help of noise-aided calculation, an ensemble improved LMD method (EILMD) is proposed to solve the modal aliasing problem of the original LMD. Using EILMD to decompose GNSS deformation monitoring series, a GNSS deformation feature extraction model based on EILMD threshold denoising is given, drawing on the wavelet soft threshold processing mode and the threshold setting method used in empirical mode decomposition denoising. Analysis of simulated data and of actual GNSS monitoring data from a mining area shows that the denoising effect of the proposed method is better than that of the EILMD, ILMD and LMD direct coercive denoising methods. It is also better than the wavelet analysis denoising method and has good adaptability. This demonstrates the feasibility and effectiveness of the proposed method in GNSS feature extraction.
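
The soft threshold processing borrowed from wavelet denoising shrinks every coefficient toward zero by the threshold and zeroes anything smaller in magnitude. The sketch below shows only that standard shrinkage rule; how the paper chooses the threshold for each decomposed component is not given in the abstract, so the threshold is simply a parameter here and the function name is hypothetical.

```python
import math


def soft_threshold(values, threshold):
    """Apply soft thresholding: sign(v) * max(|v| - threshold, 0)."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    result = []
    for v in values:
        magnitude = abs(v) - threshold
        if magnitude <= 0:
            result.append(0.0)  # small coefficients are treated as noise
        else:
            result.append(math.copysign(magnitude, v))  # shrink, keep sign
    return result
```

Unlike hard thresholding, this rule is continuous at the threshold, which tends to give smoother reconstructed series.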

Comparison of text-image fusion models for high school diploma certificate classification

Communications in Science and Technology ◽

10.21924/cst.5.1.2020.172 ◽

2020 ◽

Vol 5 (1) ◽

pp. 5-9

Author(s):

Chandra Ramadhan Atmaja Perdana ◽

Hanung Adi Nugroho ◽

Igi Ardiyanto

Keyword(s):

High School ◽

Character Recognition ◽

Optical Character Recognition ◽

High School Diploma ◽

Fusion Model ◽

Text And Image ◽

Digital Era ◽

Adaptive Fusion ◽

Scanned Documents ◽

Better Than

Scanned documents are commonly used in this digital era. Text and image extraction from scanned documents plays an important role in acquiring information. A document may contain both text and images. Combined text-image classification has been investigated previously, but in the datasets used for those works the text was provided digitally. In this research, we used a dataset of high school diploma certificates, in which the text must be acquired using optical character recognition (OCR). The certificates fall into two categories, each with three classes. We used convolutional neural networks for both text and image classification. We then combined the two models using an adaptive fusion model and a weight fusion model to find the best fusion model. We conclude that the weight fusion model, with a performance of 0.927, is better than the adaptive fusion model, with 0.892.
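
The abstract does not define its weight fusion model. One plausible reading, assumed here purely for illustration, is a convex combination of the class probabilities produced by the text classifier and the image classifier. The function names, the default weight, and the three-class example values are hypothetical.

```python
def weighted_fusion(text_probs, image_probs, text_weight=0.5):
    """Blend two class-probability vectors with a fixed weight.

    ASSUMPTION: fusion is a convex combination of per-class probabilities;
    the paper's weight fusion model may be defined differently.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    if len(text_probs) != len(image_probs):
        raise ValueError("both models must score the same classes")
    return [
        text_weight * t + (1.0 - text_weight) * i
        for t, i in zip(text_probs, image_probs)
    ]


def predicted_class(fused_probs):
    """Index of the class with the highest fused probability."""
    return max(range(len(fused_probs)), key=fused_probs.__getitem__)
```

The weight would normally be tuned on held-out data, since it decides which model prevails when the two disagree.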

Robust Character Recognition Using Adaptive Feature Extraction Method

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e93.d.125 ◽

2010 ◽

Vol E93-D (1) ◽

pp. 125-133

Author(s):

Minoru MORI ◽

Minako SAWAKI ◽

Junji YAMATO

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Extraction Method ◽

Feature Extraction Method ◽

Adaptive Feature Extraction

A Zone Based Approach for Classification and Recognition Of Telugu Handwritten Characters

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v6i4.10553 ◽

2016 ◽

Vol 6 (4) ◽

pp. 1647

Author(s):

N. Shobha Rani ◽

Sanjay Kumar Verma ◽

Anitta Joseph

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Optical Character Recognition ◽

Vital Role ◽

Indian Languages ◽

Processing Application ◽

Image Objects ◽

South Indian ◽

Recognition Systems ◽

The Times

Achieving high accuracy and efficiency in South Indian character recognition systems is one of the principal goals that must be pursued continually in order to promote the use of optical character recognition (OCR) for South Indian languages such as Telugu. The process of character recognition comprises pre-processing, segmentation, feature extraction, classification and recognition. The feature extraction stage is meant to characterize each character image uniquely so that it can be classified. The choice of a feature extraction algorithm is critical for any image processing application, and most of the time it depends directly on the type of image objects to be identified. For South Indian OCR, the feature extraction technique plays a vital role in recognition accuracy because of the huge character sets. In this work we mainly focus on evaluating the performance of various feature extraction techniques for Telugu character recognition systems and analyze their efficiency and accuracy in recognizing the Telugu character set.

Research on the Fault Feature Extraction Method of Rotor Systems Based on GAW-PSO

Mathematical Problems in Engineering ◽

10.1155/2020/9296720 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

Feng Miao ◽

Rongzhen Zhao ◽

Xianli Wang

Keyword(s):

Feature Extraction ◽

Optimization Method ◽

Population Diversity ◽

Blind Separation ◽

Feature Extraction Method ◽

Rotor Systems ◽

The Difference ◽

Dynamic Hybrid ◽

Fault Feature Extraction ◽

Better Than

To solve the problem of blind separation of signals from dynamic hybrid rotor systems, this paper proposes an improved adaptive inertia weight particle swarm optimization method based on a genetic mechanism. The method takes the negative entropy of the separated signal as the objective function and adaptively adjusts the inertia weight according to the difference in particle fitness, thus reducing the number of invalid iterations. At the same time, a genetic hybridization mechanism is introduced to increase population diversity and facilitate the processing of dynamic mixed signals. The orthogonal matrix is expressed in a parameterized form, which reduces the complexity of the algorithm. The simulation results show that the proposed method performs better than the traditional method for blind separation of dynamic hybrid analog mechanical signals. It can separate the actual dynamic rotor system signals and achieve the purpose of fault feature extraction.
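
The abstract says the inertia weight is adjusted according to differences in particle fitness but does not give the rule. The sketch below uses one possible rule, assumed here for illustration only: in a minimization setting, particles that are better than average receive a smaller inertia weight (finer local search) and the rest keep the maximum weight. The function name and the bounds 0.4 and 0.9 are hypothetical; the paper itself optimizes negative entropy and may use a different formula.

```python
def adaptive_inertia(fitness, all_fitness, w_min=0.4, w_max=0.9):
    """Inertia weight for one particle, based on its fitness in the swarm.

    ASSUMPTION: lower fitness is better (minimization), and the weight is
    interpolated linearly between the best and the average fitness.
    """
    f_min = min(all_fitness)
    f_avg = sum(all_fitness) / len(all_fitness)
    # Worse-than-average particles, or a fully converged swarm, keep the
    # largest weight so that they continue to explore.
    if fitness > f_avg or f_avg == f_min:
        return w_max
    return w_min + (w_max - w_min) * (fitness - f_min) / (f_avg - f_min)
```

In the usual PSO velocity update, this per-particle weight would multiply the previous velocity term.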
