Comparative Study of KNN, SVM and SR Classifiers in Recognizing Arabic Handwritten Characters Employing Feature Fusion

2019 ◽  
Vol 1 (2) ◽  
pp. 1-10
Author(s):  
Abu Sayeed Ahsanul Huque ◽  
Mainul Haque ◽  
Haidar A. Khan ◽  
Abdullah Al Helal ◽  
Khawza I. Ahmed

This paper evaluates and compares the performance of K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Sparse Representation Classifier (SRC) for the recognition of isolated Arabic handwritten characters. The proposed framework converts the gray-scale character image to a binary image through Otsu thresholding and size-normalizes the binary image for feature extraction. Next, we exploit image down-sampling and the histogram of image gradients as features for image classification and apply fusion (combination) of these features to improve the recognition accuracy. The performance of the proposed system is evaluated on the Isolated Farsi/Arabic Handwritten Character Database (IFHCDB), a large dataset containing gray-scale character images. Experimental results reveal that the histogram of gradients consistently outperforms down-sampling-based features, and the fusion of these two feature sets achieves the best performance. Likewise, SRC and SVM both outperform KNN, with SVM performing best among the three. Finally, we achieved an accuracy of 93.71% in character recognition with fusion of features classified by SVM, whereas 92.06% and 91.10% are achieved by SRC and KNN, respectively.
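The feature pipeline described above (down-sampled pixel densities plus a gradient-orientation histogram, fused by concatenation) can be sketched as follows. The pooling factor, bin count, and image size are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def downsample_features(img, factor=4):
    # Average-pool the normalized character image into a coarse grid
    # and flatten it into a feature vector.
    h, w = img.shape
    img = img[: h - h % factor, : w - w % factor]
    pooled = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return pooled.ravel()

def gradient_histogram(img, bins=8):
    # Histogram of gradient orientations, weighted by gradient magnitude.
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)  # orientations in [-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

def fused_features(img):
    # Fusion here is simple concatenation of the two feature vectors.
    return np.concatenate([downsample_features(img), gradient_histogram(img)])
```

The fused vectors can then be fed to any of the three classifiers compared in the paper, e.g. scikit-learn's `SVC`.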

2010 ◽  
Vol 44-47 ◽  
pp. 1583-1587 ◽  
Author(s):  
Zhen Yu He

In this paper, a new feature fusion method for handwritten character recognition based on a single tri-axis accelerometer is proposed. The process works as follows: first, short-time energy (STE) features are extracted from the accelerometer data. Second, frequency-domain features, namely Fast Fourier Transform (FFT) coefficients, are extracted. Finally, these two categories of features are fused together, and principal component analysis (PCA) is employed to reduce the dimension of the fused feature. Recognition of the gestures is performed with a multi-class Support Vector Machine. The average recognition accuracy for ten Arabic numerals using the proposed fused feature is 84.6%, which is better than using either the STE or the FFT feature alone. The experimental results show that gesture-based interaction can be used as a novel human-computer interaction for consumer electronics and mobile devices.
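The STE-plus-FFT fusion with PCA reduction can be sketched as below; the frame length, number of FFT coefficients, and target dimensionality are illustrative assumptions (the paper does not fix them here):

```python
import numpy as np
from sklearn.decomposition import PCA

def short_time_energy(signal, frame_len=16):
    # Mean squared amplitude per non-overlapping frame (time-domain feature).
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    return (frames ** 2).mean(axis=1)

def fft_magnitudes(signal, n_coeffs=16):
    # Magnitudes of the first FFT coefficients (frequency-domain feature).
    return np.abs(np.fft.rfft(signal))[:n_coeffs]

def fused_feature(signal, pca):
    # Concatenate the two feature sets, then project with a fitted PCA.
    raw = np.concatenate([short_time_energy(signal), fft_magnitudes(signal)])
    return pca.transform(raw[None, :])[0]
```

The PCA model is fitted on the fused training features; the reduced vectors are then classified with a multi-class SVM as in the paper.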


Author(s):  
Aditya Surya Wijaya ◽  
Nurul Chamidah ◽  
Mayanda Mega Santoni

Handwritten characters are difficult for machines to recognize because people have their own varied writing styles. This research recognizes handwritten character patterns of numbers and letters using the K-Nearest Neighbour (KNN) algorithm. The handwriting recognition process consists of preprocessing the handwritten image, segmentation to obtain separate single characters, feature extraction, and classification. Feature extraction is done using the zoning method, and the extracted feature data are split into training data and testing data for classification. The training data are reduced by K-Support Vector Nearest Neighbor (K-SVNN), and the handwritten patterns in the testing data are recognized with K-Nearest Neighbor (KNN). Testing results show that reducing the training data using K-SVNN is able to improve handwritten character recognition accuracy.
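Zoning divides the character image into a grid and uses a per-zone statistic as the feature. A minimal sketch, assuming ink density as the zone statistic (the concrete statistic is an assumption here):

```python
import numpy as np

def zone_features(img, zones=4):
    # Split the binary character image into a zones x zones grid and
    # use the ink density (fraction of foreground pixels) of each zone
    # as one feature, giving zones * zones features per character.
    h, w = img.shape
    img = img[: h - h % zones, : w - w % zones]
    grid = img.reshape(zones, h // zones, zones, w // zones)
    return grid.mean(axis=(1, 3)).ravel()
```

The resulting vectors can be classified with scikit-learn's `KNeighborsClassifier`, matching the KNN step described above.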


Author(s):  
Nasibah Husna Mohd Kadir ◽  
Sharifah Nur Syafiqah Mohd Nur Hidayah ◽  
Norasiah Mohammad ◽  
Zaidah Ibrahim

<span>This paper evaluates the recognition performance of Convolutional Neural Network (CNN) and Bag of Features (BoF) for multiple-font digit recognition. Font digit recognition is a part of character recognition that is used to translate images from many document-input tasks such as handwritten, typewritten and printed text. BoF is a popular machine learning method, while CNN is a popular deep learning method. Experiments were performed by applying BoF with Speeded-Up Robust Feature (SURF) and a Support Vector Machine (SVM) classifier and comparing with CNN on the Chars74K dataset. The recognition accuracy produced by BoF is just slightly lower than that of CNN, where the accuracy of CNN is 0.96 while the accuracy of BoF is 0.94.</span>
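The BoF pipeline clusters local descriptors into a "visual word" vocabulary and encodes each image as a histogram of word assignments. A minimal sketch; here generic 2-D local descriptors stand in for SURF descriptors (SURF itself is patented and not in base libraries), and the vocabulary size is an illustrative choice:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(descriptor_sets, k=8, seed=0):
    # Stack the local descriptors of all training images and cluster
    # them into k "visual words" with k-means.
    all_desc = np.vstack(descriptor_sets)
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(all_desc)

def bof_histogram(descriptors, vocab):
    # Encode one image as a normalized histogram of visual-word counts.
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()
```

The histograms are then classified with an SVM, which is the BoF+SVM setup compared against CNN above.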


2012 ◽  
Vol 424-425 ◽  
pp. 1107-1111
Author(s):  
Fu Cheng You ◽  
Ying Jie Liu

For the purpose of managing postmark information by date, this paper puts forward a method of postmark date recognition based on machine vision, which could meet the demands of personal postmark collectors. On the basis of the relevant theories of machine vision, image processing and pattern recognition, the overall process is introduced, from postmark image acquisition to date recognition. First, a threshold method is used to generate a binary image from the smoothed postmark image, so that the regions of date numbers can be extracted from the binary image according to different region features. Then, regions of date numbers that are connected or broken are processed through mathematical morphology of the binary image, and individual regions of date numbers are obtained for recognition. Finally, classification and pattern recognition based on a support vector machine classify the date numbers, and date recognition is implemented correctly.
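The morphology step that repairs broken digit regions can be sketched with a binary closing (dilation followed by erosion), which bridges small gaps left by thresholding. The 3x3 structuring element is an illustrative assumption:

```python
import numpy as np
from scipy import ndimage

def join_broken_digits(binary, iterations=1):
    # Morphological closing merges digit strokes that were broken apart
    # during thresholding; connected-component labeling then yields one
    # region per digit for recognition.
    closed = ndimage.binary_closing(
        binary, structure=np.ones((3, 3)), iterations=iterations)
    labels, n_regions = ndimage.label(closed)
    return closed, n_regions
```

Touching digits would need the dual operation (opening or erosion) to split them; the same `scipy.ndimage` routines apply.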


2020 ◽  
Vol 5 (2) ◽  
pp. 504
Author(s):  
Matthias Omotayo Oladele ◽  
Temilola Morufat Adepoju ◽  
Olaide Abiodun Olatoke ◽  
Oluwaseun Adewale Ojo

Yorùbá is one of the three main languages spoken in Nigeria. It is a tonal language that carries accents on the vowel alphabets. There are twenty-five (25) alphabets in the Yorùbá language, one of which is a digraph (GB). Due to the difficulty of typing handwritten Yorùbá documents, there is a need to develop a handwriting recognition system that can convert handwritten texts to digital format. This study discusses an offline Yorùbá handwritten word recognition system (OYHWR) that recognizes Yorùbá uppercase alphabets. Handwritten characters and words were obtained from different writers using the Paint application and M708 graphics tablets. The characters were used for training and the words were used for testing. Pre-processing was done on the images, and the geometric features of the images were extracted using zoning and gradient-based feature extraction. Geometric features are the different line types that form a particular character, such as vertical, horizontal, and diagonal lines. The geometric features used are the number of horizontal lines, number of vertical lines, number of right diagonal lines, number of left diagonal lines, total length of all horizontal lines, total length of all vertical lines, total length of all right-slanting lines, total length of all left-slanting lines, and the area of the skeleton. The characters are divided into 9 zones, and gradient feature extraction was used to extract the horizontal and vertical components and geometric features in each zone. The words were fed into the support vector machine classifier, and the performance was evaluated based on recognition accuracy. Since the support vector machine is a two-class classifier, a multiclass SVM variant, least squares support vector machine (LSSVM), was used for word recognition with the one-vs-one strategy and an RBF kernel. The recognition accuracies obtained on the tested words were 66.7%, 83.3%, 85.7%, 87.5%, and 100%. The low recognition rate for some of the words could be a result of the similarity in the extracted features.
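A much-simplified version of the geometric line features described above can be computed by counting horizontal and vertical runs of ink in the character skeleton; the minimum run length of 2 pixels and the omission of diagonal runs are simplifying assumptions:

```python
import numpy as np

def line_features(skeleton):
    # Count horizontal/vertical ink runs and their total lengths, plus the
    # skeleton area: a reduced version of the geometric feature set above
    # (diagonal-line features are omitted in this sketch).
    def runs(rows):
        count, total = 0, 0
        for row in rows:
            run = 0
            for px in row:
                if px:
                    run += 1
                else:
                    if run >= 2:  # runs of >=2 pixels count as line segments
                        count += 1
                        total += run
                    run = 0
            if run >= 2:
                count += 1
                total += run
        return count, total

    n_h, len_h = runs(skeleton)       # horizontal runs, row by row
    n_v, len_v = runs(skeleton.T)     # vertical runs, column by column
    return np.array([n_h, len_h, n_v, len_v, int(skeleton.sum())])
```

For example, a plus-shaped skeleton yields one horizontal and one vertical line of equal length, which the multiclass LSSVM would then receive as part of its feature vector.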


Diagnostics ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 574
Author(s):  
Gennaro Tartarisco ◽  
Giovanni Cicceri ◽  
Davide Di Pietro ◽  
Elisa Leonardi ◽  
Stefania Aiello ◽  
...  

In the past two decades, several screening instruments were developed to detect toddlers who may be autistic, in both clinical and unselected samples. Among others, the Quantitative CHecklist for Autism in Toddlers (Q-CHAT) is a quantitative and normally distributed measure of autistic traits that demonstrates good psychometric properties in different settings and cultures. Recently, machine learning (ML) has been applied to behavioral science to improve the classification performance of autism screening and diagnostic tools, but mainly in children, adolescents, and adults. In this study, we used ML to investigate the accuracy and reliability of the Q-CHAT in discriminating young autistic children from those who are not. Five different ML algorithms (random forest (RF), naïve Bayes (NB), support vector machine (SVM), logistic regression (LR), and K-nearest neighbors (KNN)) were applied to investigate the complete set of Q-CHAT items. Our results showed that ML achieved an overall accuracy of 90%, and the SVM was the most effective, being able to classify autism with 95% accuracy. Furthermore, using the SVM–recursive feature elimination (RFE) approach, we selected a subset of 14 items ensuring 91% accuracy, while 83% accuracy was obtained from the 3 best discriminating items in common between ours and the previously reported Q-CHAT-10. This evidence confirms the high performance and cross-cultural validity of the Q-CHAT and supports the application of ML to create shorter and faster versions of the instrument, maintaining high classification accuracy, to be used as a quick, easy, and high-performance tool in primary-care settings.
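The SVM-RFE item-selection step can be sketched with scikit-learn: a linear SVM is refitted repeatedly, dropping the lowest-weighted item each round until 14 remain. The synthetic data here only stands in for the Q-CHAT responses:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in for 25 Q-CHAT items (sample size and informativeness
# are illustrative assumptions, not the study's data).
X, y = make_classification(n_samples=200, n_features=25, n_informative=5,
                           random_state=0)

# RFE with a linear SVM recursively eliminates the weakest-weighted item
# until the requested 14-item subset remains.
selector = RFE(SVC(kernel="linear"), n_features_to_select=14).fit(X, y)
kept_items = np.flatnonzero(selector.support_)
```

The retained-item mask (`selector.support_`) identifies the short-form subset whose accuracy is then re-evaluated, mirroring the 14-item result reported above.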


Animals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1485
Author(s):  
Kaidong Lei ◽  
Chao Zong ◽  
Xiaodong Du ◽  
Guanghui Teng ◽  
Feiqi Feng

This study proposes a method and device for the intelligent mobile monitoring of oestrus on a sow farm, applied in the field of sow production. A bionic boar model that imitates the sounds, smells, and touch of real boars was built to detect the oestrus of sows after weaning. Machine vision technology was used to identify the interactive behaviour between empty sows and bionic boars and to establish deep belief network (DBN), sparse autoencoder (SAE), and support vector machine (SVM) models, and the resulting recognition accuracy rates were 96.12%, 98.25%, and 90.00%, respectively. The interaction times and frequencies between the sow and the bionic boar and the static behaviours of both ears during heat were further analysed. The results show that there is a strong correlation between the duration of contact between the oestrus sow and the bionic boar and the static behaviours of both ears. The average contact duration between the sows in oestrus and the bionic boars was 29.7 s/3 min, and the average duration in which the ears of the oestrus sows remained static was 41.3 s/3 min. The interactions between the sow and the bionic boar were used as the basis for judging the sow’s oestrus states. In contrast with the methods of other studies, the proposed innovative design for recyclable bionic boars can be used to check emotions, and machine vision technology can be used to quickly identify oestrus behaviours. This approach can more accurately obtain the oestrus duration of a sow and provide a scientific reference for a sow’s conception time.
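Since the interaction durations are the basis for judging oestrus, a minimal classifier on those two measurements can be sketched as follows. The oestrus class is centred on the averages reported above (29.7 s contact, 41.3 s static ears per 3 min); the non-oestrus distribution and the spread are assumptions for illustration only:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Features per 3-minute window: (contact duration s, static-ear duration s).
oestrus = rng.normal([29.7, 41.3], 5.0, (50, 2))        # reported averages
non_oestrus = rng.normal([8.0, 12.0], 5.0, (50, 2))     # assumed baseline
X = np.vstack([oestrus, non_oestrus])
y = np.array([1] * 50 + [0] * 50)

# An RBF-kernel SVM, one of the three models compared in the study.
clf = SVC(kernel="rbf").fit(X, y)
```

This is only a sketch of the decision step; the study's actual SVM/DBN/SAE models operate on machine-vision behaviour recognition, not directly on these two scalars.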


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1668
Author(s):  
Zongming Dai ◽  
Kai Hu ◽  
Jie Xie ◽  
Shengyu Shen ◽  
Jie Zheng ◽  
...  

Traditional co-word networks do not discriminate keywords of researcher interest from general keywords. Co-word networks are therefore often too general to provide knowledge of interest to domain experts. Inspired by recent work that uses an automatic method to identify the questions of interest to researchers, like "problems" and "solutions", we try to answer a similar question, "what sensors can be used for what kind of applications", which is of great interest in sensor-related fields. By generalizing such specific questions as "questions of interest", we built a knowledge network considering researcher interest, called the bipartite network of interest (BNOI). Different from co-word approaches that use exact keywords from a list, BNOI uses classification models to find possible entities of interest. A total of nine feature extraction methods, including N-grams, Word2Vec, BERT, etc., were used to extract features to train the classification models, including naïve Bayes (NB), support vector machines (SVM) and logistic regression (LR). In addition, a multi-feature fusion strategy and a voting principle (VP) method are applied to combine the capabilities of the features and the classification models. Using the abstract text data of 350 remote sensing articles, features were extracted and the models trained. The experimental results show that, after removing the biased words and using the ten-fold cross-validation method, the F-measures of "sensors" and "applications" are 93.2% and 85.5%, respectively. It is thus demonstrated that researcher questions of interest can be better answered by the constructed BNOI based on classification results, compared with the traditional co-word network approach.
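The voting-principle combination of the three classifiers can be approximated with a majority-vote ensemble; treating VP as plain hard voting over NB, SVM and LR is an assumption, and the synthetic features stand in for the extracted text features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the fused text features of the 350 abstracts.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Hard voting: each of NB, SVM and LR casts one vote per sample and the
# majority label wins, assembling the three models' capabilities.
vote = VotingClassifier(
    [("nb", GaussianNB()),
     ("svm", SVC(random_state=0)),
     ("lr", LogisticRegression(max_iter=1000))],
    voting="hard").fit(X, y)
```

With per-feature-set models, the same `VotingClassifier` pattern also covers the multi-feature fusion strategy, one member per feature extraction method.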


2021 ◽  
Vol 30 (1) ◽  
pp. 511-523
Author(s):  
Ephrem Admasu Yekun ◽  
Abrahaley Teklay Haile

One of the important measures of the quality of education is the performance of students in academic settings. Nowadays, abundant data about students is stored in educational institutions, which can help to discover insights into how students are learning and to improve their performance ahead of time using data mining techniques. In this paper, we developed a student performance prediction model that predicts the performance of high school students for the next semester for five courses. We modeled our prediction system as a multi-label classification task and used Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP) as base classifiers to train our model. We further improved the performance of the prediction model using a state-of-the-art partitioning scheme to divide the label space into smaller spaces and used the Label Powerset (LP) transformation method to transform each labelset into a multi-class classification task. The proposed model achieved better performance in terms of different evaluation metrics when compared to other multi-label learning approaches such as binary relevance and classifier chains.
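The Label Powerset transformation maps each distinct combination of labels to a single class, turning the multi-label problem into one multi-class problem. A minimal sketch with an SVM base classifier (the choice of base model and the synthetic data are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

def label_powerset_fit(X, Y, base=SVC):
    # Map each distinct labelset (row of the binary label matrix Y) to a
    # single class id, then train one multi-class classifier on it.
    keys = [tuple(row) for row in Y]
    classes = {k: i for i, k in enumerate(sorted(set(keys)))}
    y_single = np.array([classes[k] for k in keys])
    clf = base().fit(X, y_single)
    inverse = {i: np.array(k) for k, i in classes.items()}
    return clf, inverse

def label_powerset_predict(clf, inverse, X):
    # Map predicted class ids back to their original labelsets.
    return np.vstack([inverse[c] for c in clf.predict(X)])
```

The partitioning scheme mentioned above would first split the label space into smaller groups and apply this transformation within each group, keeping the number of powerset classes manageable.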

