Persian Handwritten Number Recognition Using Adapted Framing Feature and Support Vector Machines

Author(s):  
Hedieh Sajedi ◽  
Mehran Bahador

This paper presents a new approach for the segmentation and recognition of Persian handwritten numbers. The method combines the framing feature technique with the outer profile feature; we call this combination the adapted framing feature. In the proposed approach, numbers are segmented into digits automatically. In the classification stage, Support Vector Machines (SVM) and k-Nearest Neighbors (k-NN) are used. Experiments are conducted on the IFHCDB database, consisting of 17,740 numeral images, and the HODA database, consisting of 102,352 numeral images. At the isolated digit level, SVM with a polynomial kernel achieves a recognition rate of 99.27% on IFHCDB and 99.07% on HODA. The experiments show that the proposed method yields higher accuracy than previous research.
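As a rough illustration of the polynomial kernel mentioned above, the sketch below evaluates the common form K(x, y) = (gamma * <x, y> + coef0)^degree in plain Python. The abstract does not state the degree or any other kernel parameter, so `degree`, `gamma`, and `coef0` are assumed placeholder values, and `polynomial_kernel` is a hypothetical helper name rather than the authors' code.

```python
from typing import Sequence


def polynomial_kernel(
    x: Sequence[float],
    y: Sequence[float],
    degree: int = 3,      # assumed value; not given in the abstract
    gamma: float = 1.0,   # assumed value; not given in the abstract
    coef0: float = 1.0,   # assumed value; not given in the abstract
) -> float:
    """Return (gamma * <x, y> + coef0) ** degree for two feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Two toy 2-dimensional feature vectors (illustrative only).
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2))
```

An SVM never needs the expanded polynomial feature space explicitly; it only needs kernel values like this one between pairs of feature vectors.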

Author(s):  
KWANG IN KIM ◽  
JIN HYUNG KIM ◽  
KEECHUL JUNG

This paper presents a real-time face recognition system. For the system to run in real time, no external time-consuming feature extraction method is used; rather, the gray-level values of the raw pixels that make up the face pattern are fed directly to the recognizer. To absorb the resulting high dimensionality of the input space, support vector machines (SVMs), which are known to work well even in high-dimensional spaces, are used as the face recognizer. Furthermore, a modified form of the polynomial kernel (the local correlation kernel) is used to take account of prior knowledge about facial structures and serves as an alternative feature extractor. Since SVMs were originally developed for two-class classification, their basic scheme is extended to multiface recognition by adopting one-per-class decomposition. To make a final classification from the several one-per-class SVM outputs, a neural network (NN) is used as the arbitrator. Experiments with the ORL database show a recognition rate of 97.9% and a speed of 0.22 seconds per face with 40 classes.
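The one-per-class decomposition described above can be sketched as follows: one binary scorer per class, each returning a confidence that the input belongs to its class. Note that this sketch picks the class with the highest score (a simple argmax), which is a stand-in for, not a reproduction of, the neural network arbitrator used in the paper. The linear scorers and the names `linear_scorer` and `one_per_class_predict` are hypothetical.

```python
from typing import Callable, Dict, Sequence

Scorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> Scorer:
    """Build a toy binary scorer; a trained SVM decision function would go here."""
    def score(x: Sequence[float]) -> float:
        return sum(w * v for w, v in zip(weights, x)) + bias
    return score


def one_per_class_predict(scorers: Dict[str, Scorer], x: Sequence[float]) -> str:
    """Score x with every per-class model and return the best-scoring label.

    Assumption: argmax arbitration replaces the paper's NN arbitrator.
    """
    scores = {label: score(x) for label, score in scorers.items()}
    return max(scores, key=lambda label: scores[label])


# Three hypothetical identities, each with its own binary scorer.
scorers = {
    "person_a": linear_scorer([1.0, 0.0], 0.0),
    "person_b": linear_scorer([0.0, 1.0], 0.0),
    "person_c": linear_scorer([-1.0, -1.0], 0.5),
}
print(one_per_class_predict(scorers, [0.2, 0.9]))
```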


2021 ◽  
pp. 1-29
Author(s):  
Ahmed Alsaihati ◽  
Mahmoud Abughaban ◽  
Salaheldin Elkatatny ◽  
Abdulazeez Abdulraheem

Fluid loss into formations is a common operational issue when drilling across naturally fractured or induced-fracture formations. It can pose significant operational risks, such as well-control problems, stuck pipe, and wellbore instability, which in turn increase well time and cost. This research uses and evaluates several machine learning techniques, namely support vector machines, random forests, and K-nearest neighbors, for detecting loss circulation occurrences while drilling using solely drilling surface parameters. Actual field data from seven wells that had suffered partial or severe loss circulation were used to build predictive models, while Well-8 was used to compare the performance of the developed models. Recall, precision, and F1-score were used to evaluate the ability of the developed models to detect loss circulation occurrences. The results showed that the K-nearest neighbors classifier achieved a high F1-score of 0.912 in detecting loss circulation occurrences in the testing set, while the random forests classifier was second best with almost the same F1-score of 0.910. The support vector machines achieved an F1-score of 0.83 on the testing set. The K-nearest neighbors classifier outperformed the other models in detecting loss circulation occurrences in Well-8, with an F1-score of 0.80. The main contribution of this research compared to previous studies is that it identifies loss events based on real-time measurements of the active pit volume.
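For readers unfamiliar with the metrics named above, here is a minimal sketch of how precision, recall, and F1-score are computed from binary predictions. The labels below are made-up toy values, not data from the study, and `precision_recall_f1` is a hypothetical helper name.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Toy labels: 1 = loss event, 0 = normal drilling (illustrative only).
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 1, 0, 1, 0]
print(precision_recall_f1(toy_true, toy_pred))
```

F1 is the harmonic mean of precision and recall, so it stays low unless both are reasonably high.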


2018 ◽  
Vol 7 (1) ◽  
pp. 9-16
Author(s):  
Selvia Lorena Br Ginting ◽  
Aldi Azhar Permana

This research aims to build an application that analyzes bank customer data and then determines each customer's eligibility for a loan, in order to avoid bad-credit problems in the future. The method used is a hybrid that combines two data mining classification techniques: Support Vector Machines (SVM) and K-Nearest Neighbors (KNN). SVM works by finding the optimal hyperplane and the support vectors. The KNN algorithm then classifies the bank customer data based on the identified support vectors. With 2000 training records and 103 test records, and parameter values cost=0.1 and gamma=2, the system identified 1998 support vectors; then, with K=16, 88.35% of the test data were matched correctly (91 of 103). It can be concluded that the application works reasonably well and can help credit analysts recommend customers who are eligible for a loan. Keywords - application; data mining; classification; hybrid method; SVM-KNN
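The second stage of the hybrid described above, KNN voting over the support vectors, might look roughly like the sketch below. It assumes the support vectors have already been obtained from a trained SVM and are available as labeled points. The feature values, the labels, and the name `knn_predict` are hypothetical, and Euclidean distance is an assumption since the abstract does not name a distance measure.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

LabeledPoint = Tuple[Sequence[float], str]


def knn_predict(reference: List[LabeledPoint], query: Sequence[float], k: int) -> str:
    """Majority vote among the k reference points closest to the query."""
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the number of reference points")
    # Assumed metric: Euclidean distance (the abstract does not specify one).
    nearest = sorted(reference, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical support vectors (two made-up features per customer).
support_vectors: List[LabeledPoint] = [
    ([0.0, 0.0], "reject"),
    ([0.0, 1.0], "reject"),
    ([5.0, 5.0], "approve"),
    ([6.0, 5.0], "approve"),
    ([5.0, 6.0], "approve"),
]
print(knn_predict(support_vectors, [5.2, 5.1], k=3))
```

Restricting the reference set to support vectors reduces the number of distance computations only when few training points become support vectors; in the reported run, 1998 of 2000 training records did.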


Author(s):  
Srinivas Tennety ◽  
Manish Kumar

In this paper, enhancements are made to the previously proposed Support Vector Machines (SVM) based path planning algorithm [1]. SVM, a data classification technique, is applied in conjunction with the k-Nearest Neighbors (k-NN) algorithm for autonomous navigation in unknown road-like environments. Features of the road, such as lane markers, and the obstacles within the robot’s visibility are divided into two classes using the k-NN algorithm; SVM then finds a maximum-margin hyperplane that optimally separates the two classes. This hyperplane represents a collision-free path. The proposed algorithm has been tested in a variety of environments, and the scenarios in which it was unsuccessful have been identified and addressed for improvement. Simulation results of the enhanced algorithm are presented, and its performance is compared with VFH+, a purely local obstacle avoidance algorithm based on the artificial potential field method.


10.29007/h71z ◽  
2020 ◽  
Author(s):  
Waleed Almutairi ◽  
Ryszard Janicki

The paper deals with problems often encountered with imbalanced and overlapping datasets. Performance indicators such as accuracy, precision, and recall for imbalanced datasets, both with and without overlapping, are discussed and compared with the same indicators for balanced datasets with overlapping. Three popular classification algorithms, namely Decision Tree, KNN (k-Nearest Neighbors), and SVM (Support Vector Machines), are analyzed and compared.
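One reason such indicators need care on imbalanced data can be shown with a small synthetic example: a classifier that always predicts the majority class scores high accuracy while detecting no minority samples. The 95/5 class split below is invented for illustration and does not come from the paper; `accuracy` and `recall` are hypothetical helper names.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Fraction of true positive-class samples that were predicted positive."""
    actual = sum(1 for t in y_true if t == positive)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    return hits / actual if actual else 0.0


# Synthetic imbalanced dataset: 95 majority (0) and 5 minority (1) samples.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred), recall(y_true, y_pred))
```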


2021 ◽  
Vol 2 (3) ◽  
pp. 427-437
Author(s):  
Jatmiko Indriyanto ◽  
Miftakhul Huda ◽  
Ida Afriliana

The aim of this research is to develop a C4.5 algorithm based on Particle Swarm Optimization for determining insurance eligibility. Methods for this task include logistic regression, decision trees, k-nearest neighbors, naïve Bayes, and support vector machines. Such a model determines or predicts a customer's future status. Similar studies have been carried out before, but in different ways. In this study, the C4.5 classification algorithm based on Particle Swarm Optimization (PSO) is used, with the expectation of better accuracy than the C4.5 algorithm alone, to address the problem of selecting insurance products. It can be concluded that the accuracy of the PSO-based C4.5 model, 98.93%, is better than that of the plain C4.5 model, 97.84%. From these results, the difference between the two models is reported as 0.4%. In the evaluation using the ROC curve, the C4.5 model has an AUC of 0.970, rated Excellent Classification, and the PSO-based C4.5 model has an AUC of 0.968, also rated Excellent Classification.


Linguamática ◽  
2019 ◽  
Vol 11 (1) ◽  
pp. 41-53
Author(s):  
Alessandra Harumi Iriguti ◽  
Valéria Delisandra Feltrim

Rhetorical structure classification is an NLP task that seeks to identify the rhetorical components of a discourse and their relationships. In this work, the aim was to automatically identify the sentence-level categories that make up the rhetorical structure of scientific abstracts. Specifically, the objective was to evaluate the impact of different feature sets on the implementation of rhetorical classifiers for scientific abstracts written in Portuguese. To this end, surface features (extracted as TF-IDF values and selected with the chi-square test), morphosyntactic features (implemented by the AZPort classifier), and features extracted from word embedding models (Word2Vec, Wang2Vec, and GloVe, all pre-trained) were used. These feature sets, as well as their combinations, were used to train classifiers with the following supervised learning algorithms: Support Vector Machines, Naive Bayes, K-Nearest Neighbors, Decision Trees, and Conditional Random Fields (CRF). The classifiers were evaluated by cross-validation on three corpora composed of thesis and dissertation abstracts. The best result, an F1 of 94%, was obtained by the CRF classifier with the following feature combinations: (i) Wang2Vec--Skip-gram of dimension 100 with the AZPort features; (ii) Wang2Vec--Skip-gram and GloVe of dimension 300 with the AZPort features; (iii) TF-IDF, AZPort, and embeddings extracted with the Wang2Vec--Skip-gram models of dimensions 100 and 300 and the GloVe model of dimension 300. From these results, it is concluded that the features from the AZPort classifier were essential to the good performance of the CRF classifier, while combining them with word embeddings proved valid for improving the results.


2020 ◽  
Vol 17 (5) ◽  
pp. 789-798
Author(s):  
Hichem Rahab ◽  
Abdelhafid Zitouni ◽  
Mahieddine Djoudi

In this paper, we propose an enhanced approach to creating a dedicated corpus of comments from Algerian Arabic newspapers. The approach enhances an existing one by enriching the available corpus and including an annotation step that follows the Model Annotate Train Test Evaluate Revise (MATTER) approach. The corpus is created by collecting comments from the websites of three well-known Algerian newspapers. Three classifiers, support vector machines, naïve Bayes, and k-nearest neighbors, were used to classify comments into positive and negative classes. To identify the influence of stemming on the results, the classification was tested with and without stemming. The results show that stemming does not considerably improve classification, owing to the nature of Algerian comments, which are tied to the Algerian Arabic dialect. The promising results motivate us to improve our approach, especially in dealing with non-Arabic sentences, particularly dialectal and French ones.

