Classification Accuracy and Model Selection in k-Nearest Neighbors Classifiers for Data Driven Learning

2017 ◽

pp. 897-914 ◽

Cited By ~ 3

Author(s):

Ahmed.T. Sahlol ◽

Aboul Ella Hassanien

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Random Forests ◽

Classification Accuracy ◽

Processing Time ◽

Optimization Algorithms ◽

Nearest Neighbors ◽

Benchmark Dataset ◽

K Nearest Neighbors ◽

Linear Discriminant

Achieving high recognition accuracy in Arabic handwritten optical character recognition systems still faces many obstacles: each character can take different shapes, and many characters closely resemble one another. In this chapter, several bio-inspired optimization algorithms for feature selection are presented, including the Bat Algorithm, Grey Wolf Optimization, the Whale Optimization Algorithm, Particle Swarm Optimization, and the Genetic Algorithm, and Arabic handwritten character recognition is chosen as the application for assessing their ability and accuracy. The experiments were performed on a benchmark dataset, CENPARMI, using k-nearest neighbors, Linear Discriminant Analysis, and random forests. The results show that the features selected by the optimization algorithms outperform the whole feature set in terms of both classification accuracy and processing time.


Improvement and Comparison of Weighted k Nearest Neighbors Classifiers for Model Selection

Journal of Software Engineering ◽

10.3923/jse.2016.109.118 ◽

2015 ◽

Vol 10 (1) ◽

pp. 109-118 ◽

Cited By ~ 6

Author(s):

Ming Zhao ◽

Jingchao Chen

Keyword(s):

Model Selection ◽

Nearest Neighbors ◽

K Nearest Neighbors


Model selection for k-nearest neighbors regression using VC bounds

Proceedings of the International Joint Conference on Neural Networks, 2003. ◽

10.1109/ijcnn.2003.1223852 ◽

2004 ◽

Cited By ~ 3

Author(s):

V. Cherkassky ◽

Yunqian Ma ◽

Jun Tang

Keyword(s):

Model Selection ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

Selection For


Improving an AI-Based Algorithm to Automatically Generate Concept Maps

Computer and Information Science ◽

10.5539/cis.v12n4p72 ◽

2019 ◽

Vol 12 (4) ◽

pp. 72

Author(s):

Sara Alomari ◽

Salha Abdullah

Keyword(s):

Text Analysis ◽

Classification Accuracy ◽

Concept Maps ◽

Concept Map ◽

Nearest Neighbors ◽

Learning Method ◽

K Nearest Neighbors ◽

Intelligent Algorithms ◽

Effective Learning

Concept maps have been used as an effective learning method that helps learners identify relationships between pieces of information, especially when teaching materials cover many topics or concepts. However, building a concept map manually is a long and tedious task. It is time-consuming and demands intensive effort in reading the full content and reasoning about the relationships among concepts. Because of this inefficiency, many studies have sought to develop intelligent algorithms using data mining techniques. In this research, the authors aim to improve the Text Analysis-Association Rules Mining (TA-ARM) algorithm by using the weighted K-nearest neighbors (KNN) algorithm instead of the traditional KNN. The weighted KNN is expected to improve the classification accuracy, which will eventually enhance the quality of the generated concept map.
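The difference between traditional and weighted KNN mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which weighting scheme TA-ARM uses, so the inverse-distance weighting below is an assumption, and the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def plain_knn_predict(train, query, k=3):
    """Traditional KNN: every one of the k nearest neighbors casts one equal vote."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(int)
    for _, label in neighbors:
        votes[label] += 1
    return max(votes, key=votes.get)


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Weighted KNN (assumed scheme): each neighbor votes with weight 1 / (distance + eps)."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        # Closer neighbors contribute more; eps avoids division by zero on exact matches.
        votes[label] += 1.0 / (math.dist(features, query) + eps)
    return max(votes, key=votes.get)


# Toy one-dimensional data: one close "a" point and two distant "b" points.
train = [((0.0,), "a"), ((3.0,), "b"), ((3.5,), "b")]
query = (0.5,)
print(plain_knn_predict(train, query, k=3))     # majority vote picks "b"
print(weighted_knn_predict(train, query, k=3))  # the close "a" point outweighs both "b" points
```

With equal votes the two distant points win by count; with distance weights the single nearby point dominates, which is the behavior a weighted variant is meant to provide.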


A New K-Nearest Neighbors Classifier for Big Data Based on Efficient Data Pruning

Mathematics ◽

10.3390/math8020286 ◽

2020 ◽

Vol 8 (2) ◽

pp. 286 ◽

Cited By ~ 8

Author(s):

Hamid Saadatfar ◽

Samiyeh Khosravi ◽

Javad Hassannataj Joloudari ◽

Amir Mosavi ◽

Shahaboddin Shamshirband

Keyword(s):

Big Data ◽

Classification Accuracy ◽

Learning Algorithm ◽

Computational Cost ◽

Nearest Neighbors ◽

K Nearest Neighbors ◽

Parametric Classification ◽

Efficient Data ◽

Data Pruning ◽

Selection Of

The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it to big data comes with computational challenges. KNN determines the class of a new sample based on the classes of its nearest neighbors; identifying those neighbors in a large amount of data imposes such a large computational cost that the method is no longer practical on a single computing machine. One of the techniques proposed to make classification methods applicable to large datasets is pruning. LC-KNN is an improved KNN method that first clusters the data into smaller partitions using the K-means clustering method, and then applies KNN for each new sample on the partition whose center is nearest. However, because the clusters have different shapes and densities, selecting the appropriate cluster is a challenge. In this paper, an approach is proposed to improve the pruning phase of the LC-KNN method by taking these factors into account. The proposed approach helps to choose a more appropriate cluster of data in which to look for the neighbors, thus increasing the classification accuracy. The performance of the proposed approach is evaluated on different real datasets. The experimental results show the effectiveness of the proposed approach, with higher classification accuracy and lower time cost than other recent relevant methods.
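The baseline LC-KNN idea described in this abstract (cluster with K-means, then run KNN only inside the partition whose center is nearest) can be sketched as follows. This is a simplified illustration of the baseline, not the improved pruning the paper proposes; the function names are hypothetical, and the deterministic center initialization is an assumption made to keep the example reproducible.

```python
import math
from collections import Counter


def assign_to_centers(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c])) for p in points]


def kmeans(points, n_clusters, n_iters=20):
    """Plain K-means with evenly spaced data points as initial centers (an assumption)."""
    centers = [points[i * len(points) // n_clusters] for i in range(n_clusters)]
    for _ in range(n_iters):
        assignment = assign_to_centers(points, centers)
        for c in range(n_clusters):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assign_to_centers(points, centers)


def lc_knn_predict(points, labels, centers, assignment, query, k=3):
    """Run KNN only on the partition whose center is nearest to the query."""
    nearest = min(range(len(centers)), key=lambda c: math.dist(query, centers[c]))
    candidates = [(p, y) for p, y, a in zip(points, labels, assignment) if a == nearest]
    if not candidates:  # fall back to the full dataset if the partition is empty
        candidates = list(zip(points, labels))
    candidates.sort(key=lambda item: math.dist(item[0], query))
    return Counter(y for _, y in candidates[:k]).most_common(1)[0][0]


# Toy data: two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["a", "a", "a", "b", "b", "b"]
centers, assignment = kmeans(points, n_clusters=2)
print(lc_knn_predict(points, labels, centers, assignment, (0.5, 0.5)))
```

Because only one partition is searched, the cost of each query scales with the partition size rather than the full dataset, at the risk of missing true neighbors that fall in an adjacent cluster.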


Real Time Efficient Accident Predictor System using Machine Learning Techniques (kNN, RF, LR, DT)

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.d6910.1210220 ◽

2020 ◽

Vol 10 (2) ◽

pp. 108-111

Keyword(s):

Machine Learning ◽

Random Forest ◽

Real Time ◽

Classification Accuracy ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Classification Methods ◽

K Nearest Neighbors ◽

Learning Techniques

A real-time crash predictor system determines the frequency and the severity of crashes. Machine learning based methods are now used to predict the total number of crashes. In this project, the prediction accuracy of machine learning algorithms such as Decision Tree (DT), K-nearest neighbors (KNN), Random Forest (RF), and Logistic Regression (LR) is evaluated. The performance of these classification methods is analyzed in terms of accuracy. The dataset for this project was obtained from 49 states of the US and 27 states of India, and contains 2.25 million US crash records and 1.16 million Indian crash records, respectively. Results indicate that the classification accuracy obtained from Random Forest (RF) is 96% when compared with the other classification methods.


Assessment of Plant Disease Identification using GLCM and KNN Algorithms

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e5018.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 4900-4904

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Plant Disease ◽

Nearest Neighbors ◽

Input Image ◽

Plant Diseases ◽

Disease Detection ◽

Textural Features ◽

K Nearest Neighbors ◽

Disease Identification

Agriculture is one of the significant sectors of the Indian economy, providing employment to almost 50% of the nation's labor force. India is recognized as the world's largest producer of pulses, rice, wheat, spices and spice crops. A farmer's economic growth depends on the quality of the produce, which in turn depends on the plant's growth and the yield obtained. Consequently, the identification of plant diseases plays an important role in agriculture. Plants are highly prone to diseases that disturb their growth, which in turn affects the farmer. Automatic disease detection is beneficial for identifying a plant disease at an early stage. The symptoms of plant diseases are visible in various parts of a plant, such as the leaves. Manual identification of plant disease from leaf images is a tedious job. The k-means clustering procedure is used to segment the input images. The GLCM (gray-level co-occurrence matrix) procedure extracts textural features from the input image, and the KNN (k-nearest neighbors) algorithm is used for image classification, producing a classification accuracy of 70 to 75% for different inputs. Hence, machine learning based computational methods are needed to automate disease detection and classification from leaf images. To improve on the performance of existing methods, machine learning and deep learning algorithms will be used for more accurate classification.
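As a rough illustration of the GLCM step named in this abstract, the sketch below counts how often pairs of gray levels occur at a fixed pixel offset and derives three common texture features from the normalized matrix. The offset, the choice of features, the function names, and the toy image are assumptions; the abstract does not state which settings the authors used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in the range [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Contrast, energy, and homogeneity of a normalized co-occurrence matrix."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


# Toy 3x3 image with three gray levels, using horizontally adjacent pixel pairs.
image = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 2, 2],
]
print(texture_features(glcm(image, levels=3)))
```

Feature values like these would form the vector passed to a KNN classifier; in practice several offsets and angles are usually combined.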


Comparison of Instance Selection and Construction Methods with Various Classifiers

Applied Sciences ◽

10.3390/app10113933 ◽

2020 ◽

Vol 10 (11) ◽

pp. 3933 ◽

Cited By ~ 1

Author(s):

Marcin Blachnik ◽

Mirosław Kordos

Keyword(s):

Classification Accuracy ◽

Nearest Neighbors ◽

General Purpose ◽

Support Vector ◽

Instance Selection ◽

Selection Methods ◽

Training Set ◽

K Nearest Neighbors ◽

Set Size ◽

Construction Methods

Instance selection and construction methods were originally designed to improve the performance of the k-nearest neighbors classifier by increasing its speed and improving its classification accuracy. These goals were achieved by eliminating redundant and noisy samples, thus reducing the size of the training set. In this paper, the performance of instance selection methods is investigated in terms of classification accuracy and reduction of training set size. The classification accuracy of the following classifiers is evaluated: decision trees, random forest, Naive Bayes, linear model, support vector machine and k-nearest neighbors. The results indicate that for most of the classifiers, compressing the training set affects prediction performance, and only a small group of instance selection methods can be recommended as a general purpose preprocessing step. These are learning vector quantization based algorithms, along with Drop2 and Drop3. Other methods are less efficient or provide a low compression ratio.
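To make the idea of removing noisy samples concrete, here is a minimal sketch of Wilson's Edited Nearest Neighbor (ENN) rule, which drops every training sample whose label disagrees with the majority of its k nearest neighbors. ENN is used here only as a simple stand-in; it is not one of the methods named in the abstract, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter


def edited_nearest_neighbors(samples, k=3):
    """Keep only samples whose label matches the majority label of their k nearest neighbors.

    `samples` is a list of (feature_vector, label) pairs.
    """
    kept = []
    for i, (x, y) in enumerate(samples):
        # Rank every other sample by its distance to the current one.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda item: math.dist(item[0], x))
        majority = Counter(label for _, label in others[:k]).most_common(1)[0][0]
        if majority == y:
            kept.append((x, y))
    return kept


# Toy data: two clean groups plus one mislabeled point inside the "a" group.
samples = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
    ((0.5, 0.5), "b"),  # noisy sample surrounded by "a" points
    ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"), ((11, 11), "b"),
]
cleaned = edited_nearest_neighbors(samples, k=3)
print(len(samples), "->", len(cleaned))
```

A rule like this targets noise rather than redundancy, so it typically yields little compression on its own.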


Improving k Nearest Neighbors and Naïve Bayes Classifiers Through Space Transformations and Model Selection

IEEE Access ◽

10.1109/access.2020.3042453 ◽

2020 ◽

Vol 8 ◽

pp. 221669-221688

Author(s):

Jose Ortiz-Bejar ◽

Eric S. Tellez ◽

Mario Graff ◽

Daniela Moctezuma ◽

Sabino Miranda-Jimenez

Keyword(s):

Model Selection ◽

Naive Bayes ◽

Nearest Neighbors ◽

Naïve Bayes ◽

K Nearest Neighbors


Noninvasive Blood Pressure Classification Based on Photoplethysmography Using K-Nearest Neighbors Algorithm: A Feasibility Study

Information ◽

10.3390/info11020093 ◽

2020 ◽

Vol 11 (2) ◽

pp. 93 ◽

Cited By ~ 2

Author(s):

Tjahjadi ◽

Ramli

Keyword(s):

Blood Pressure ◽

Deep Learning ◽

Early Detection ◽

Classification Accuracy ◽

Nearest Neighbors ◽

National Committee ◽

Committee Report ◽

Noninvasive Blood Pressure ◽

K Nearest Neighbors

Blood pressure (BP) is an important parameter for the early detection of heart disease because it is associated with symptoms of hypertension or hypotension. A single photoplethysmography (PPG) method for the classification of BP can automatically analyze BP symptoms. Users can immediately know the condition of their BP to ensure early detection. In recent years, deep learning methods have presented outstanding performance in classification applications. However, there are two main problems in deep learning classification methods: classification accuracy and time consumption during training. We attempt to address these limitations and propose a method for the classification of BP using the K-nearest neighbors (KNN) algorithm based on PPG. We collected data for 121 subjects from the PPG–BP figshare database. We divided the subjects into three classification levels, namely normotension, prehypertension, and hypertension, according to the BP levels of the Joint National Committee report. The F1 scores of these three classification trials were 100%, 100%, and 90.80%, respectively. Hence, it is validated that the proposed method can achieve improved classification accuracy without additional manual pre-processing of PPG. Our proposed method achieves higher accuracy than convolutional neural networks (deep learning), bagged tree, logistic regression, and AdaBoost tree.
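Since this abstract reports results as F1 scores, a short sketch of how a per-class F1 score is computed from true and predicted labels may help. The labels below are made-up toy values, not data from the study, and the function name is hypothetical.

```python
def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class to its F1 score (harmonic mean of precision and recall)."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denominator = precision + recall
        scores[cls] = 2 * precision * recall / denominator if denominator else 0.0
    return scores


# Toy labels only: one prehypertension sample is misclassified as hypertension.
y_true = ["normotension", "normotension", "prehypertension", "prehypertension",
          "hypertension", "hypertension"]
y_pred = ["normotension", "normotension", "prehypertension", "hypertension",
          "hypertension", "hypertension"]
print(per_class_f1(y_true, y_pred))
```

A single misclassification lowers the F1 score of both classes involved: recall drops for the true class and precision drops for the predicted one.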


Classification Accuracy and Model Selection in k-Nearest Neighbors Classifiers for Data Driven Learning

Bio-Inspired Optimization Algorithms for Arabic Handwritten Characters
