Investigating the Performance of Cosine Value and Jensen-Shannon Divergence in the kNN Algorithm

K-nearest neighbor (kNN) is a commonly used text categorization algorithm. Previous studies mainly focused on improving the algorithm by modifying feature selection and the selection of k. This research investigates the possibility of using Jensen-Shannon divergence as the similarity measure in the kNN classifier and compares its classification accuracy with that of the cosine measure. The experiment indicates that the kNN algorithm based on Jensen-Shannon divergence outperforms the one based on cosine value, while performance also depends largely on the number of categories and the number of documents per category.
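
The two measures compared in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy term-count vectors, the labels, the choice k = 3 and the helper names (`normalize`, `knn_predict`) are all assumptions made for the example. Documents are represented as normalized term-frequency vectors so that the Jensen-Shannon divergence is defined.

```python
import math
from collections import Counter


def cosine_similarity(p, q):
    """Cosine of the angle between two term vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two probability vectors."""
    def kl(a, m):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, m) if x > 0)

    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


def normalize(counts):
    """Turn raw term counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def knn_predict(query, train, k, distance):
    """Majority vote among the k training items closest to the query."""
    neighbors = sorted(train, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: counts of three terms per document.
train = [
    (normalize([8, 1, 1]), "sports"),
    (normalize([7, 2, 1]), "sports"),
    (normalize([1, 1, 8]), "finance"),
    (normalize([2, 1, 7]), "finance"),
]
query = normalize([6, 2, 2])

js_label = knn_predict(query, train, 3, js_divergence)
# Cosine is a similarity, so 1 - cosine is used as the distance.
cos_label = knn_predict(query, train, 3,
                        lambda a, b: 1.0 - cosine_similarity(a, b))
print(js_label, cos_label)
```

On such a small, well-separated example both measures agree; any difference in accuracy of the kind the abstract reports would only show up on a realistic corpus.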

Pulmonary Acoustic Signal Classification Using Autoregressive Coefficients and k-Nearest Neighbor

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.591.211 ◽

2014 ◽

Vol 591 ◽

pp. 211-214 ◽

Cited By ~ 7

Author(s):

Rajkumar Palaniappan ◽

Kenneth Sundaraj ◽

Sebastian Sundaraj ◽

N. Huliraj ◽

S.S. Revadi ◽

...

Keyword(s):

Classification Accuracy ◽

Bandpass Filter ◽

Nearest Neighbor ◽

Pathological Condition ◽

Confusion Matrix ◽

Acoustic Signals ◽

Signal Classification ◽

K Nearest Neighbor ◽

Knn Classifier ◽

Classifier Performance

Pulmonary acoustic signals provide important information about the condition of the respiratory system and can assist medical professionals as an alternative diagnostic tool. In this paper, we aim to discriminate between normal (without any pathological condition), airway obstruction (AO) pathology and interstitial lung disease (ILD) pathology using pulmonary acoustic signals. The proposed method removes heart sounds and other artifacts using a Butterworth bandpass filter, and the filtered signal is windowed into segments of 256 samples. Autoregressive (AR) coefficients are extracted as features from the pulmonary acoustic signals. The extracted features are classified using a k-nearest neighbor (k-NN) classifier. Classifier performance is analysed using a confusion matrix. A mean classification accuracy of 96.12% was reported for the proposed method. The confusion-matrix analysis of the k-NN classifier revealed that normal, AO and ILD pathology are classified at 94.36%, 95.18% and 94.68% classification accuracy respectively. The analysis reveals that the proposed method performs well in distinguishing between normal, AO and ILD.

Keywords: Respiratory sound, AR coefficients, k-nearest neighbor, confusion matrix
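
The windowing and feature-extraction steps can be sketched as below. The abstract does not state how the AR coefficients were estimated or which model order was used, so the Yule-Walker estimate via the Levinson-Durbin recursion, the order of 4 and the synthetic test signal are assumptions chosen for illustration. The Butterworth filtering stage is omitted.

```python
import math


def autocorr(x, max_lag):
    """Biased autocorrelation estimate of a mean-removed signal, lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def ar_coefficients(x, order):
    """Yule-Walker AR estimate via Levinson-Durbin.

    Returns a[0..order-1] for the model x[t] = sum(a[k-1] * x[t-k]) + e[t].
    Assumes the segment is not constant (otherwise r[0] is zero).
    """
    r = autocorr(x, order)
    a = []
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for model order m.
        k = (r[m] - sum(a[j] * r[m - 1 - j] for j in range(m - 1))) / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        err *= (1.0 - k * k)
    return a


def segment(signal, size=256):
    """Split a signal into non-overlapping windows of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


# Synthetic stand-in for a filtered recording (assumption for illustration).
signal = [math.sin(0.1 * t) + 0.5 * math.sin(0.37 * t) for t in range(1024)]
segments = segment(signal)
features = [ar_coefficients(seg, order=4) for seg in segments]
print(len(segments), "segments,", len(features[0]), "AR features each")
```

Each row of `features` would then be passed to a k-NN classifier, and the predicted labels tabulated against the true labels to form the confusion matrix.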

Diagnosis of colorectal cancer based on imperialist competitive algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189021 ◽

2020 ◽

Vol 39 (4) ◽

pp. 5359-5368

Author(s):

B Ratna Raju ◽

G.N Swamy ◽

K. Padma Raju

Keyword(s):

Colorectal Cancer ◽

Colon Cancer ◽

Feature Selection ◽

Euclidean Distance ◽

Nearest Neighbor ◽

Imperialist Competitive Algorithm ◽

Experimental Results ◽

K Nearest Neighbor ◽

Competitive Algorithm ◽

Knn Classifier

Colorectal cancer has caused a growing number of deaths in recent years. Early diagnosis makes the disease safer to treat. Colonoscopy is commonly used to identify and treat this type of cancer. Feature selection methods are proposed to choose a subset of variables and attain better prediction. An Imperialist Competitive Algorithm (ICA) is proposed to select features for the identification of colon cancer and its treatment. A K-Nearest Neighbor (KNN) classifier is then used to find the minimal Euclidean distance between the feature vector of the query and the prototype training data. Experimental results show that the proposed method outperforms other methods on the performance metrics considered and achieves better accuracy.

Improved Weighted k-Nearest Neighbor Based on PSO for Wind Power System State Recognition

Energies ◽

10.3390/en13205520 ◽

2020 ◽

Vol 13 (20) ◽

pp. 5520

Author(s):

Chun-Yao Lee ◽

Kuan-Yu Huang ◽

Yi-Xing Shen ◽

Yao-Chen Lee

Keyword(s):

Feature Selection ◽

Power System ◽

Wind Power ◽

Classification Accuracy ◽

Nearest Neighbor ◽

Signal To Noise Ratio ◽

Radial Basis Function Network ◽

Distance Judgment ◽

K Nearest Neighbor ◽

Wind Power System

In this paper, we propose a weighted k-nearest neighbor classifier improved by particle swarm optimization (PWKNN) to diagnose failures of a wind power system. PWKNN adjusts weights to correctly reflect the importance of features and uses the distance judgment strategy to figure out the identical probability of multi-label classification. PSO optimizes the weights and the parameter k of PWKNN. Testing is based on four classified conditions of a 300 W wind generator: healthy, loss of lubrication in the gearbox, angular misaligned rotor, and bearing fault. Current signals are used to measure the conditions. Through feature extraction, the testing establishes a feature database that is used to train the classifiers. The correlation coefficient is applied for feature selection to eliminate irrelevant features and reduce classifier runtime without lowering classification accuracy. A comparison with traditional classifiers, i.e., backpropagation neural network (BPNN), k-nearest neighbor (k-NN), and radial basis function network (RBFN), shows that PWKNN has a higher classification accuracy. Feature selection reduces the average number of features from 16 to 2.8 and reduces the runtime by 61%. The method classifies the four conditions accurately without being affected by noise and reaches an accuracy of 83% at a signal-to-noise ratio (SNR) of 20 dB. The results show that the PWKNN approach is capable of diagnosing failures of a wind power system.
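
The idea of weighting features inside the k-NN distance can be sketched as follows. This is only an illustration of a feature-weighted Euclidean distance: the PSO search for the weights and k, and the paper's distance judgment strategy, are not shown, and the sample values, labels and weight vectors are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature is scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def weighted_knn_predict(query, samples, labels, weights, k):
    """Majority vote among the k samples nearest under the weighted distance."""
    order = sorted(range(len(samples)),
                   key=lambda i: weighted_distance(query, samples[i], weights))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.0, 0.0], [0.05, 3.0], [0.4, 1.0], [0.45, 4.0]]
labels = ["healthy", "healthy", "fault", "fault"]
query = [0.1, 1.1]

# Equal weights let the noisy feature dominate the distance.
unweighted = weighted_knn_predict(query, samples, labels, [1.0, 1.0], k=1)
# Weights of the kind an optimizer might find suppress the noisy feature.
weighted = weighted_knn_predict(query, samples, labels, [1.0, 0.0], k=1)
print(unweighted, weighted)
```

In a PSO setting, each particle would encode a candidate weight vector (and k), and the classification accuracy on held-out data would serve as the fitness to maximize.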

Feature Selection Approach based on Firefly Algorithm and Chi-square

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v8i4.pp2338-2350 ◽

2018 ◽

Vol 8 (4) ◽

pp. 2338 ◽

Cited By ~ 1

Author(s):

Emad Mohamed Mashhour ◽

Enas M. F. El Houby ◽

Khaled Tawfik Wassif ◽

Akram I. Salah

Keyword(s):

Feature Selection ◽

Discriminant Analysis ◽

Classification Accuracy ◽

Firefly Algorithm ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Chi Square ◽

Fitness Functions ◽

Selection Approach ◽

Feature Selection Approach

The dimensionality problem is a well-known challenge for most classifiers when datasets have an unbalanced number of samples and features. Features may contain unreliable data, which may lead the classification process to produce undesirable results. Feature selection is considered a solution to this kind of problem. In this paper, an enhanced firefly algorithm is proposed as a feature selection solution for reducing dimensionality and picking the most informative features to be used in classification. The main purpose of the proposed model is to improve classification accuracy by using the selected features it produces, thereby decreasing classification errors. The firefly is modeled by simulating its position with a cell chi-square value, which changes after every move, and simulating its intensity by calculating a set of different fitness functions as a weight for each feature. K-nearest neighbor and discriminant analysis are used as classifiers to test the proposed firefly algorithm in selecting features. Experimental results showed that the proposed enhanced algorithm, based on the firefly algorithm with chi-square and different fitness functions, can provide better results than other methods. The results also showed that reducing the dataset is useful for gaining higher classification accuracy.
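
The chi-square value on which the firefly positions are based can be illustrated with a plain chi-square score between one categorical feature and the class labels. This sketch covers only that statistic; the firefly moves and fitness functions are not reproduced, and the toy features and labels are assumptions.

```python
def chi_square_score(feature, labels):
    """Chi-square statistic of the feature-by-class contingency table."""
    n = len(labels)
    f_vals = sorted(set(feature))
    c_vals = sorted(set(labels))

    # Observed counts for every (feature value, class) cell.
    observed = {(f, c): 0 for f in f_vals for c in c_vals}
    for f, c in zip(feature, labels):
        observed[(f, c)] += 1

    f_tot = {f: sum(observed[(f, c)] for c in c_vals) for f in f_vals}
    c_tot = {c: sum(observed[(f, c)] for f in f_vals) for c in c_vals}

    score = 0.0
    for f in f_vals:
        for c in c_vals:
            # Expected count if feature and class were independent.
            expected = f_tot[f] * c_tot[c] / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


# Hypothetical binary features for eight samples in two classes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "informative": [0, 0, 0, 0, 1, 1, 1, 1],
    "noisy": [0, 1, 0, 1, 0, 1, 0, 1],
}
scores = {name: chi_square_score(col, labels) for name, col in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

A feature that is independent of the class scores zero, while one that perfectly predicts the class in a 2x2 table scores the number of samples.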

Feature Selection on K-Nearest Neighbor Algorithm Using Similarity Measure

2020 3rd International Conference on Mechanical, Electronics, Computer, and Industrial Technology (MECnIT) ◽

10.1109/mecnit48290.2020.9166612 ◽

2020 ◽

Author(s):

Ratih Puspadini ◽

Herman Mawengkang ◽

Syahril Efendi

Keyword(s):

Feature Selection ◽

Similarity Measure ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

A Hybrid Classification Approach Based on Support Vector Machine and K-Nearest Neighbor for Remote Sensing Data

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001417500343 ◽

2017 ◽

Vol 31 (10) ◽

pp. 1750034 ◽

Cited By ~ 8

Author(s):

Gulnaz Alimjan ◽

Tieli Sun ◽

Hurxida Jumahun ◽

Yu Guan ◽

Wanting Zhou ◽

...

Keyword(s):

Remote Sensing ◽

Classification Accuracy ◽

Nearest Neighbor ◽

Remote Sensing Data ◽

Support Vector ◽

K Nearest Neighbor ◽

Sensing Data ◽

Knn Classifier ◽

Training Samples ◽

Hybrid Classification

Analysis and classification of remote sensing landscapes based on remote sensing imagery is a popular research topic. In this paper, we propose a new remote sensing data classifier that incorporates support vector machine (SVM) learning information into the K-nearest neighbor (KNN) classifier. The SVM is well known for its extraordinary generalization capability even with limited learning samples, which makes it very useful for remote sensing applications, where data samples are usually limited. The KNN has been widely used in data classification due to its simplicity and effectiveness. However, the KNN is instance-based and needs to keep all the training samples for classification, which can cause not only high computational complexity but also overfitting problems. Meanwhile, the performance of the KNN classifier is sensitive to the neighborhood size k, and selecting the value of k relies heavily on practice and experience. Based on the observation that the SVM can help the KNN both with smaller training sample sizes and with the selection of the parameter k, we propose a support vector nearest neighbor (abbreviated as SV-NN) hybrid classification approach that simplifies parameter selection while maintaining classification accuracy. The proposed approach consists of two stages. In the first stage, the SVM is trained on the training samples to obtain the reduced support vectors (SVs) for each of the sample categories. In the second stage, a nearest neighbor classifier (NNC) is used to classify a testing sample, i.e. the average Euclidean distance between the testing data point and each set of SVs from the different categories is calculated, and the NNC identifies the category with the minimum distance.
To evaluate the effectiveness of the proposed approach, experiments are first carried out on classifying samples from remote sensing data, and then on identifying different land cover regions in remote sensing images. Experimental results show that the SV-NN approach maintains good classification accuracy while reducing the number of training samples compared with the conventional SVM and KNN classification models.
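
The second stage described in this abstract, assigning a test point to the category whose support vectors are closest on average, can be sketched as below. The first stage (training the SVM to obtain the support vectors) is not shown; the support vectors, class names and test points here are hypothetical values standing in for its output.

```python
import math


def mean_distance(point, vectors):
    """Average Euclidean distance from a point to a set of vectors."""
    return sum(math.dist(point, v) for v in vectors) / len(vectors)


def sv_nn_classify(point, support_vectors_by_class):
    """Return the class whose support vectors are nearest on average."""
    return min(
        support_vectors_by_class,
        key=lambda label: mean_distance(point, support_vectors_by_class[label]),
    )


# Hypothetical support vectors per land-cover class (stage-one output).
support_vectors = {
    "water": [[0.1, 0.2], [0.2, 0.1]],
    "forest": [[0.8, 0.9], [0.9, 0.7]],
}

label_a = sv_nn_classify([0.15, 0.25], support_vectors)
label_b = sv_nn_classify([0.85, 0.80], support_vectors)
print(label_a, label_b)
```

Because the decision uses one averaged distance per class, no neighborhood size k has to be chosen, which matches the abstract's point about simplified parameter selection.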

Preprocessing of Skin Images and Feature Selection for Early Stage of Melanoma Detection using Color Feature Extraction

International Journal of Artificial Intelligence Research ◽

10.29099/ijair.v4i2.165 ◽

2021 ◽

Vol 4 (2) ◽

pp. 95

Author(s):

Yuita Arum Sari ◽

Anggi Gustiningsih Hapsani ◽

Sigit Adinugroho ◽

Lukman Hakim ◽

Siti Mutrofin

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Nearest Neighbor ◽

Early Stage ◽

Normal Skin ◽

Feature Selection Method ◽

K Nearest Neighbor ◽

Color Feature ◽

K Value ◽

Linear Discriminant

Preprocessing is essential for achieving good segmentation, since it affects the feature extraction process. Melanomas have various shapes, and features extracted from their images are used for early-stage detection. Because melanoma is a dangerous disease, early detection is required to prevent the cancer from progressing further. In this paper, we propose a new framework to detect cancer in skin images using color feature extraction and feature selection. The default color space of skin images is RGB; brightness is added to distinguish normal from darkened areas of the skin. An average filter and histogram equalization are then applied to obtain color intensities capable of distinguishing normal skin from suspicious skin. Otsu thresholding is subsequently used for melanoma segmentation. A total of 147 features are extracted from the segmented images. These features are reduced using three feature selection algorithms: Linear Discriminant Analysis (LDA), Correlation-based Feature Selection (CFS), and Relief. All selected features are classified using k-Nearest Neighbor (k-NN). Relief proved to be the best of the feature selection methods, and the optimal k value is 7 under 10-fold cross-validation, with accuracies of 0.835 and 0.845 without and with feature selection, respectively. The results indicate that the framework is applicable to early skin cancer detection.
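
The Otsu thresholding step used for segmentation can be sketched on a flat list of 8-bit grayscale values. The preprocessing described above (brightness adjustment, average filter, histogram equalization) is not shown, and the pixel values are a hypothetical bimodal example rather than real skin image data.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their intensities
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Hypothetical image: a dark region and a bright region.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
threshold = otsu_threshold(pixels)
mask = [p > threshold for p in pixels]
print(threshold, sum(mask))
```

The resulting mask separates the two intensity populations; which side corresponds to the lesion depends on the image and the preprocessing applied.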

The Classification of Skateboarding Tricks : A Transfer Learning and Machine Learning Approach

Mekatronika ◽

10.15282/mekatronika.v2i2.6683 ◽

2020 ◽

Vol 2 (2) ◽

pp. 1-12

Author(s):

Muhammad Nur Aiman Shapiee ◽

Muhammad Ar Rahim Ibrahim ◽

Muhammad Amirul Abdullah ◽

Rabiu Muazu Musa ◽

Noor Azuan Abu Osman ◽

...

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Nearest Neighbor ◽

Olympic Games ◽

Learning Approach ◽

K Nearest Neighbor ◽

Test Dataset ◽

Machine Learning Approach ◽

Competitive Games

Skateboarding has reached new heights, particularly with its first appearance at the now-delayed Tokyo Summer Olympic Games. Owing to the scale of such competitive games, advanced and innovative assessment approaches have increasingly gained attention from relevant stakeholders, particularly given the interest in more objective evaluation. This study aims to classify skateboarding tricks, specifically Frontside 180, Kickflip, Ollie, Nollie Front Shove-it, and Pop Shove-it, by integrating image processing and Transfer Learning (TL) feature extraction with a traditional Machine Learning (ML) classifier. A male skateboarder performed each of the five tricks consistently, and a YI Action camera captured the movement from a distance of 1.26 m. Features were then extracted from the image dataset using three TL models and subsequently classified with a k-Nearest Neighbor (k-NN) classifier. The initial experiments showed that MobileNet, NASNetMobile, and NASNetLarge coupled with optimized k-NN classifiers attain a classification accuracy (CA) of 95%, 92% and 90%, respectively, on the test dataset. Moreover, the robustness evaluation showed that the MobileNet+k-NN pipeline is more robust, as it provides a better average CA than the other pipelines. The study suggests that the proposed approach can classify skateboard tricks adequately and could, in the long run, support judges in making more objective decisions.
