Javanese Character Feature Extraction Based on Shape Energy

2017, Vol 5 (1), pp. 154-169
Author(s): Galih Hendra Wibowo, Riyanto Sigit, Aliridho Barakbah

Javanese script is part of Indonesia's cultural heritage, especially in Java. However, the number of Javanese people who can read the script has declined, so conservation efforts are needed in the form of a system that can recognize the characters. One solution to this problem lies in Optical Character Recognition (OCR), where one of the most critical steps is feature extraction, which distinguishes each character from the others. Shape Energy is a feature extraction method built on the idea that a character can be distinguished simply through its skeleton. Building on this idea, the feature extraction is developed further, based on its components, to produce an angular histogram whose bins use various angle multiples. The performance of this method and of the base method is then tested on a Javanese character dataset collected from various images, comprising 240 samples with 19 labels, using K-Nearest Neighbors as the classifier. Performance is measured as the accuracy obtained through cross-validation: 80.83% for the angular histogram with an angle of 20 degrees, 23% better than Shape Energy. Other test results show that the method can recognize rotated characters, with the lowest performance value of 86% at 180-degree rotation and the highest of 96.97% at 90-degree rotation. It can be concluded that this method improves on Shape Energy in recognizing Javanese characters and is robust to rotation.
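The abstract does not say how the angular histogram is computed, so the following is only a minimal sketch of one plausible reading: each skeleton pixel is binned by its angle around the skeleton's centroid, using 20-degree bins. The function name `angular_histogram`, the centroid-based angle, and the normalization are assumptions made for illustration, not the authors' method.

```python
import math

def angular_histogram(points, bin_degrees=20):
    """Normalized histogram of pixel angles around the centroid.

    points: list of (x, y) skeleton pixel coordinates (hypothetical input).
    bin_degrees: bin width in degrees; must divide 360 evenly.
    """
    if not points:
        raise ValueError("points must not be empty")
    if 360 % bin_degrees != 0:
        raise ValueError("bin_degrees must divide 360")
    n_bins = 360 // bin_degrees
    # Centroid of the skeleton pixels.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    hist = [0] * n_bins
    counted = 0
    for x, y in points:
        dx, dy = x - cx, y - cy
        if dx == 0 and dy == 0:
            continue  # a pixel exactly at the centroid has no defined angle
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        hist[int(angle // bin_degrees) % n_bins] += 1
        counted += 1
    if counted == 0:
        return [0.0] * n_bins
    # Normalize so the feature does not depend on the number of pixels.
    return [h / counted for h in hist]

# Four pixels placed symmetrically around the origin.
features = angular_histogram([(2, 1), (-2, -1), (-1, 2), (1, -2)])
```

With 20-degree bins the feature vector has 18 entries; a rotation of the character by a multiple of the bin width corresponds to a circular shift of this vector, which is one reason histograms of this kind are often paired with rotation experiments.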

Author(s): Htwe Pa Pa Win, Phyo Thu Thu Khine, Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of printed Myanmar documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the choice of feature extraction method. Existing OCR systems use various feature extraction methods because of the diversity of the scripts' natures. One major contribution of this work is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, the paper assumes that the documents have been successfully segmented into characters and extracts features from these isolated Myanmar characters. The features are extracted using structural analysis of the Myanmar script. The experiments use a Support Vector Machine (SVM) classifier and compare the results with the previously proposed feature extraction method.


2017, Vol 17 (02), pp. 1750012
Author(s): Mohammad Javad Parseh, Mojtaba Meftahi

Feature extraction is one of the most important steps in Optical Character Recognition (OCR) systems and strongly affects recognition accuracy. This paper proposes a suitable combination of different features, such as zoning, hole size, and crossing counts, among others, for Persian handwritten digit recognition. Because of the large number of features, the feature vector has high dimensionality, which increases training time exponentially. To solve this problem, Principal Component Analysis (PCA) is employed to reduce the dimensionality of the feature vector. Finally, the data are classified with a Support Vector Machine (SVM). The proposed method was evaluated on the HODA dataset, one of the largest standard datasets of Persian handwritten digits, which includes 60,000 training and 20,000 test samples. The proposed method reaches 99.07% accuracy on this dataset, and the experimental results show a significant improvement in the accuracy of Persian handwritten OCR compared with previous methods.
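Of the features named above, crossing counts are simple enough to illustrate directly. The sketch below counts background-to-foreground transitions along every row and column of a binary image; the function name `crossing_counts` and the choice to concatenate row and column counts are assumptions, since the abstract does not give the exact definition used in the paper.

```python
def crossing_counts(image):
    """Count 0 -> 1 transitions along each row and each column.

    image: list of equal-length rows of 0/1 values (1 = ink).
    Returns the row counts followed by the column counts.
    """
    def transitions(line):
        count, previous = 0, 0
        for value in line:
            if value and not previous:
                count += 1  # entered a stroke
            previous = value
        return count

    row_counts = [transitions(row) for row in image]
    col_counts = [transitions(col) for col in zip(*image)]
    return row_counts + col_counts

digit = [
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
features = crossing_counts(digit)
```

In a pipeline like the one described, vectors of this kind would be concatenated with the other features before dimensionality reduction and classification.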


Complexity, 2021, Vol 2021, pp. 1-8
Author(s): Juanjuan Huang, Ihtisham Ul Haq, Chaolan Dai, Sulaiman Khan, Shah Nazir, ...

Handwritten text recognition is considered the most challenging task for the research community because the shapes of different characters differ only slightly in handwritten documents. The lack of a standard dataset makes the problem even harder for researchers to work on. To address these problems, this paper presents an optical character recognition system for offline Pashto characters. The lack of a standard handwritten Pashto character database is addressed by developing a medium-sized database of offline Pashto characters. This database consists of 11352 character images (258 samples for each of the 44 characters in the Pashto script). Enriched feature extraction techniques, namely histogram of oriented gradients and zoning-based density features, are used for feature extraction of carved Pashto characters. K-nearest neighbors is used as the classifier for the proposed algorithm, based on the proposed feature sets. An accuracy of 80.34% is obtained for the histogram of oriented gradients, while zoning-based density features achieve 76.42%, using 10-fold cross-validation.
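As a rough illustration of zoning-based density features, the sketch below splits a binary character image into a grid of zones and records the fraction of ink pixels in each zone. The grid size, the function name `zoning_density`, and the integer zone boundaries are assumptions; the abstract does not state the configuration used for the Pashto database.

```python
def zoning_density(image, zones_y, zones_x):
    """Fraction of ink pixels in each zone of a zones_y x zones_x grid.

    image: list of equal-length rows of 0/1 values (1 = ink).
    Zones are listed row by row, left to right.
    """
    height, width = len(image), len(image[0])
    features = []
    for zy in range(zones_y):
        y0 = zy * height // zones_y
        y1 = (zy + 1) * height // zones_y
        for zx in range(zones_x):
            x0 = zx * width // zones_x
            x1 = (zx + 1) * width // zones_x
            area = (y1 - y0) * (x1 - x0)
            ink = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            features.append(ink / area if area else 0.0)
    return features

glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
densities = zoning_density(glyph, 2, 2)
```

Here the top-left zone is fully inked and the bottom-right zone contains one ink pixel out of four, so the densities differ by zone and can serve as a compact description of where the strokes lie.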


2018
Author(s): I Wayan Agus Surya Darma

Balinese character recognition is a technique for recognizing the features or patterns of Balinese characters. The features of a Balinese character are generated through a feature extraction process. This research uses handwritten Balinese characters. Feature extraction is the process of obtaining the features of a character. In this research, the feature extraction process generates semantic and direction features of handwritten Balinese characters. Recognition uses the K-Nearest Neighbor algorithm to recognize 81 handwritten Balinese characters. The features of the test character images are compared with reference features. With K=3 and reference=10, the recognition system achieves a success rate of 97.53%.
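The comparison of test features against reference features with K=3 can be sketched as a plain nearest-neighbor vote. The Euclidean distance, the function name `knn_predict`, and the toy two-dimensional feature vectors are assumptions for illustration; the abstract does not specify the distance measure or feature layout.

```python
import math
from collections import Counter

def knn_predict(references, query, k=3):
    """Label a query vector by majority vote among its k nearest references.

    references: list of (feature_vector, label) pairs.
    query: feature vector with the same length as the reference vectors.
    """
    nearest = sorted(references, key=lambda ref: math.dist(ref[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical reference features for two character classes.
references = [
    ((0.0, 0.0), "a"),
    ((0.0, 1.0), "a"),
    ((5.0, 5.0), "b"),
    ((5.0, 6.0), "b"),
    ((6.0, 5.0), "b"),
]
label_near_a = knn_predict(references, (0.2, 0.3), k=3)
label_near_b = knn_predict(references, (5.0, 5.2), k=3)
```

For the first query, two of the three nearest references carry label "a", so the vote returns "a" even though the third neighbor belongs to the other class.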


2020, Vol 14 (4), pp. 445-453
Author(s): Qian Fan, Yiqun Zhu

Abstract: To solve the problem that the moving span of the basic local mean decomposition (LMD) method is difficult to choose reasonably, an improved LMD method (ILMD), which uses cubic spline interpolation in place of the sliding average, is proposed. On this basis, with the help of noise-aided calculation, an ensemble improved LMD method (EILMD) is proposed to effectively solve the modal aliasing problem in the original LMD. Using EILMD to decompose GNSS deformation monitoring series, a GNSS deformation feature extraction model based on EILMD threshold denoising is given, drawing on the wavelet soft-threshold processing mode and the threshold setting method used in empirical mode decomposition denoising. Analysis of simulated data and actual GNSS monitoring data from a mining area shows that the denoising effect of the proposed method is better than that of the EILMD, ILMD and LMD direct coercive denoising methods. It is also better than the wavelet analysis denoising method and has good adaptability. This demonstrates the feasibility and effectiveness of the proposed method in GNSS feature extraction.
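The soft-threshold processing mode mentioned above follows the standard rule: values with magnitude below the threshold are set to zero, and all others are shrunk toward zero by the threshold. The sketch below applies that rule to a plain list of numbers; applying it to LMD components and the way the threshold is chosen are left out, because the abstract does not describe them.

```python
def soft_threshold(values, threshold):
    """Standard soft thresholding: sign(v) * max(|v| - threshold, 0)."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    result = []
    for v in values:
        if v > threshold:
            result.append(v - threshold)   # shrink positive values
        elif v < -threshold:
            result.append(v + threshold)   # shrink negative values
        else:
            result.append(0.0)             # suppress small values as noise
    return result

denoised = soft_threshold([3.0, -2.0, 0.5, -0.5], 1.0)
```

Unlike hard thresholding, which keeps the surviving values unchanged, this rule is continuous at the threshold, which tends to avoid artificial jumps in the reconstructed series.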


2020, Vol 5 (1), pp. 5-9
Author(s): Chandra Ramadhan Atmaja Perdana, Hanung Adi Nugroho, Igi Ardiyanto

Scanned document files are commonly used in this digital era. Text and image extraction from scanned documents plays an important role in acquiring information. A document may contain both text and images. Combined text-image classification has been investigated previously, but in the datasets used for that research the text was provided digitally. In this research, we used a dataset of high school diploma certificates, in which the text must be acquired using optical character recognition (OCR). There were two categories for these certificates, each with three classes. We used convolutional neural networks for both text and image classification. We then combined the two models using an adaptive fusion model and a weight fusion model to find the best fusion model. We conclude that the weight fusion model, with a performance of 0.927, outperforms the adaptive fusion model, with 0.892.
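The abstract does not define the weight fusion model, so the following is only a sketch of one common form of weighted late fusion: the class probabilities from the text model and the image model are averaged with a fixed weight, and the class with the highest fused score is chosen. The function names, the single weight `w_text`, and the example probabilities are all assumptions.

```python
def weighted_fusion(p_text, p_image, w_text=0.5):
    """Fuse two class-probability vectors with a fixed weight on the text model."""
    if len(p_text) != len(p_image):
        raise ValueError("both models must score the same classes")
    if not 0.0 <= w_text <= 1.0:
        raise ValueError("w_text must lie in [0, 1]")
    return [w_text * t + (1.0 - w_text) * i for t, i in zip(p_text, p_image)]

def predict_class(fused):
    """Index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda c: fused[c])

# Hypothetical outputs of the two models for one document with three classes.
p_text = [0.7, 0.2, 0.1]
p_image = [0.2, 0.5, 0.3]
equal_weight = weighted_fusion(p_text, p_image, w_text=0.5)
image_heavy = weighted_fusion(p_text, p_image, w_text=0.2)
```

With equal weights the text model's preferred class wins, while a weight that favors the image model flips the decision, which shows why the weight is usually tuned on validation data.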


Author(s): N. Shobha Rani, Sanjay Kumar Verma, Anitta Joseph

Achieving high accuracy and efficiency in South Indian character recognition systems is one of the principal goals to be pursued repeatedly in order to promote the use of optical character recognition (OCR) for South Indian languages such as Telugu. The process of character recognition comprises pre-processing, segmentation, feature extraction, classification and recognition. The feature extraction stage is meant to uniquely characterize each character image for the purpose of classifying it. The choice of feature extraction algorithm is critical for any image processing application, and most of the time it depends directly on the type of image objects to be identified. For optical technologies like South Indian OCR, the feature extraction technique plays a vital role in recognition accuracy because of the huge character sets. In this work we focus on evaluating the performance of various feature extraction techniques for Telugu character recognition systems and analyze their efficiency and accuracy in recognizing the Telugu character set.


2020, Vol 2020, pp. 1-10
Author(s): Feng Miao, Rongzhen Zhao, Xianli Wang

To solve the problem of blind separation of signals from dynamic hybrid rotor systems, this paper proposes an improved adaptive inertia weight particle swarm optimization method based on a genetic mechanism. The method takes the negative entropy of the separated signal as the objective function and adaptively adjusts the inertia weight according to differences in particle fitness, thus reducing the number of invalid iterations. At the same time, a genetic hybridization mechanism is introduced to increase population diversity and facilitate the processing of dynamic mixed signals. The orthogonal matrix is expressed in a parameterized form, which reduces the complexity of the algorithm. The simulation results show that the proposed method performs better than the traditional method for blind separation of dynamic hybrid analog mechanical signals. It can separate signals from an actual dynamic rotor system and achieve the purpose of fault feature extraction.
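The abstract says the inertia weight is adjusted according to differences in particle fitness but does not give the rule. The sketch below uses one widely used adaptive scheme as a stand-in, assuming the objective is maximized: particles at or above the swarm's average fitness get a smaller inertia weight (finer local search), and particles below average keep the maximum weight (wider exploration). The function name and the bounds `w_min` and `w_max` are assumptions, not values from the paper.

```python
def adaptive_inertia(fitness, swarm_fitness, w_min=0.4, w_max=0.9):
    """Inertia weight for one particle, assuming fitness is maximized.

    fitness: fitness of the particle being updated.
    swarm_fitness: fitness values of all particles in the swarm.
    """
    f_max = max(swarm_fitness)
    f_avg = sum(swarm_fitness) / len(swarm_fitness)
    # Below-average particles, or a fully converged swarm, keep exploring.
    if fitness < f_avg or f_max == f_avg:
        return w_max
    # Interpolate: the best particle gets w_min, an average one gets w_max.
    return w_min + (w_max - w_min) * (f_max - fitness) / (f_max - f_avg)

swarm = [1.0, 2.0, 3.0, 6.0]
weights = [adaptive_inertia(f, swarm) for f in swarm]
```

In a full particle swarm update this weight would scale each particle's previous velocity before the cognitive and social terms are added.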

