Accelerating wrapper-based feature selection with K-nearest-neighbor

2015 ◽  
Vol 83 ◽  
pp. 81-91 ◽  
Author(s):  
Aiguo Wang ◽  
Ning An ◽  
Guilin Chen ◽  
Lian Li ◽  
Gil Alterovitz
2010 ◽  
Vol 44-47 ◽  
pp. 1130-1134
Author(s):  
Sheng Li ◽  
Pei Lin Zhang ◽  
Bing Li

Feature selection is a key step in hydraulic system fault diagnosis. Some of the collected features are unrelated to the classification model, and some are highly correlated with other features. Such features are harmful when building a classification model. To solve this problem, genetic algorithm-partial least squares (GA-PLS) is proposed for selecting representative and optimal features. The K nearest neighbor (KNN) algorithm is used for diagnosing and classifying hydraulic system faults. To demonstrate the performance of GA-PLS, the original data of a model engineering hydraulic system are used, and the results of GA-PLS are compared with those obtained using all features and using GA. The experimental results show that the proposed feature selection method can diagnose and classify hydraulic system faults more efficiently while using fewer features.


2021 ◽  
Vol 12 (2) ◽  
pp. 85-99
Author(s):  
Nassima Dif ◽  
Zakaria Elberrichi

Hybrid metaheuristics have received a lot of attention lately for solving combinatorial optimization problems. The purpose of hybridization is to create cooperation between metaheuristics in order to obtain better solutions. Most proposed works have focused on static hybridization. The objective of this work is to propose a novel dynamic hybridization method (GPBD) that generates the most suitable sequential hybridization between the GA, PSO, BAT, and DE metaheuristics for each problem. The authors test this approach on the feature selection problem using a wrapper strategy, on face image recognition datasets, with the k-nearest neighbor (KNN) learning algorithm. The comparative study of the metaheuristics and their hybridization GPBD shows that the proposed approach achieved the best results. It was also competitive with filter approaches proposed in the literature. It achieved a perfect accuracy score of 100% on the Orl10P, Pix10P, and PIE10P datasets.
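
The wrapper idea in this abstract, scoring candidate feature subsets by the accuracy of a KNN classifier, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GPBD method: a plain greedy forward search stands in for the GA, PSO, BAT, and DE metaheuristics, leave-one-out accuracy is assumed as the fitness measure, and the names `loo_knn_accuracy` and `forward_select` are hypothetical.

```python
from math import dist


def loo_knn_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of a k-NN classifier that sees only `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        point = [xi[f] for f in features]
        # Distances from sample i to every other sample, on the chosen columns only.
        neighbours = sorted(
            (dist(point, [xj[f] for f in features]), j)
            for j, xj in enumerate(X)
            if j != i
        )[:k]
        # Majority vote among the k nearest neighbours.
        votes = {}
        for _, j in neighbours:
            votes[y[j]] = votes.get(y[j], 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == yi
    return correct / len(X)


def forward_select(X, y, k=1):
    """Greedy wrapper: add the feature that most improves k-NN accuracy,
    and stop as soon as no remaining feature improves it."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(loo_knn_accuracy(X, y, selected + [f], k), f) for f in remaining]
        top_score = max(score for score, _ in scored)
        if top_score <= best_score:
            break
        # Ties are broken in favour of the lowest feature index.
        top_feature = next(f for score, f in scored if score == top_score)
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy data (made up for illustration): column 0 separates the classes, column 1 is noise.
X_toy = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y_toy = [0, 0, 0, 1, 1, 1]
print(forward_select(X_toy, y_toy))
```

Every candidate subset costs a full round of KNN evaluations, which is why wrapper methods are expensive and why the choice of search strategy matters.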


2020 ◽  
Author(s):  
Hoda Heidari ◽  
Zahra Einalou ◽  
Mehrdad Dadgostar ◽  
Hamidreza Hosseinzadeh

Studies of electroencephalography-based Brain-Computer Interfaces (BCI) have a wide range of applications. Extracting the Steady State Visual Evoked Potential (SSVEP) is regarded as one of the most useful tools in BCI systems. In this study, different methods were compared across the whole signal processing chain: feature extraction (Shannon entropy, skewness, kurtosis, mean, variance) with different spectral methods (bank of filters, narrow-band IIR filters, and wavelet transform magnitude); feature selection by various methods (decision tree, principal component analysis (PCA), t-test, Wilcoxon, receiver operating characteristic (ROC)); and classification with k nearest neighbor (k-NN), perceptron, support vector machines (SVM), Bayesian, and multiple layer perceptron (MLP) classifiers. Combining these methods gave an overview of the accuracy of classical methods. In addition, the present study relied on a relatively new feature selection approach based on decision tree and PCA, applied to BCI-SSVEP systems. Finally, the obtained accuracies were calculated based on the four recorded frequencies representing four directions: right, left, up, and down.


Sensors ◽  
2020 ◽  
Vol 20 (5) ◽  
pp. 1447
Author(s):  
Pan Huang ◽  
Yanping Li ◽  
Xiaoyi Lv ◽  
Wen Chen ◽  
Shuxian Liu

Action recognition algorithms are widely used in the fields of medical health and pedestrian dead reckoning (PDR). The classification and recognition of non-normal walking actions and normal walking actions are very important for improving the accuracy of medical health indicators and PDR steps. Existing motion recognition algorithms focus on the recognition of normal walking actions, and the recognition of non-normal walking actions common to daily life is incomplete or inaccurate, resulting in a low overall recognition accuracy. This paper proposes a microelectromechanical system (MEMS) action recognition method based on Relief-F feature selection and relief-bagging-support vector machine (SVM). Feature selection using the Relief-F algorithm reduces the dimensions by 16 and reduces the optimization time by an average of 9.55 s. Experiments show that the improved algorithm for identifying non-normal walking actions has an accuracy of 96.63%; compared with Decision Tree (DT), it increased by 11.63%; compared with k-nearest neighbor (KNN), it increased by 26.62%; and compared with random forest (RF), it increased by 11.63%. The average Area Under Curve (AUC) of the improved algorithm improved by 0.1143 compared to KNN, by 0.0235 compared to DT, and by 0.04 compared to RF.


2018 ◽  
Vol 30 (06) ◽  
pp. 1850044 ◽  
Author(s):  
Elias Ebrahimzadeh ◽  
Farahnaz Fayaz ◽  
Mehran Nikravan ◽  
Fereshteh Ahmadi ◽  
Mohammadjavad Rahimi Dolatabad

Herniation in the lumbar area is one of the most common diseases that result in lower back pain (LBP), causing discomfort and inconvenience in patients’ daily lives. A computer-aided diagnosis (CAD) system can be of immense benefit, as it generates diagnostic results within a short time while increasing the precision of diagnosis and eliminating human errors. We propose a new method for automatic diagnosis of lumbar disc herniation based on clinical MRI data. We use T2-W sagittal and myelograph images. The presented method has been applied to 30 clinical cases, each containing 7 discs (210 lumbar discs), for herniation diagnosis. We employ the Otsu thresholding method to extract the spinal cord from MR images of lumbar discs. A third-order polynomial is then fitted to the extracted spinal cords, and by the end of the preprocessing stage, all the T2-W sagittal images have been prepared for specifying disc boundaries and labeling. Having extracted an ROI for each disc, we use intensity and shape features for classification. The extracted features are selected by Local Subset Feature Selection. The results demonstrated 91.90%, 92.38%, and 95.23% accuracy for artificial neural network, K-nearest neighbor, and support vector machine (SVM) classifiers, respectively, indicating the superiority of the proposed method to those mentioned in similar studies.
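
The Otsu thresholding step named in this abstract picks the intensity threshold that maximises the between-class variance of the two resulting pixel groups. The sketch below shows that idea on a flat list of integer intensities. It is an illustration only, not the authors' pipeline: the function name `otsu_threshold`, the 256-level assumption, and the toy pixel values are all hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance.

    Pixels with intensity <= t form one class, pixels > t the other.
    `pixels` is assumed to be a flat sequence of ints in range(levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the current threshold
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy example (made-up values): three dark and three bright pixels.
print(otsu_threshold([10, 12, 11, 200, 205, 198]))
```

When several thresholds give the same variance, this version keeps the lowest one. Other implementations may break such ties differently.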


Author(s):  
Sophia S ◽  
Rajamohana SP

Online shoppers today are technically knowledgeable and receptive to product reviews. They usually read buyer reviews and ratings before purchasing a product from an e-commerce website. Customer reviews are a vital source of information for understanding products and services. Online reviews are therefore important both for individuals choosing the right products and for organizations making business decisions. These reviews or opinions, in turn, reveal the strengths and weaknesses of products. Spam reviews are written to falsely promote or demote a few target products or services, so detecting them has become a critical issue for customers trying to make good purchasing decisions. A major problem in fake review detection is the high dimensionality of the feature space, which contains redundant, noisy, and irrelevant features. Feature selection is therefore an essential step in fake review detection, both to reduce the dimensionality of the feature space and to improve classification accuracy. To address this, deep learning techniques for selecting features are necessary. For classification, classifiers such as Naive Bayes and K Nearest Neighbor are used. The various techniques employed to distinguish fake from genuine reviews are surveyed.


2014 ◽  
Vol 701-702 ◽  
pp. 110-113
Author(s):  
Qi Rui Zhang ◽  
He Xian Wang ◽  
Jiang Wei Qin

This paper reports a comparative study of feature selection algorithms on a hyperlipimedia data set. Three methods of feature selection were evaluated: document frequency (DF), information gain (IG), and a χ² statistic (CHI). The classification systems use a vector to represent a document and use tfidfie (term frequency, inverted document frequency, and inverted entropy) to compute term weights. To compare the effectiveness of feature selection, we used three classification methods: Naïve Bayes (NB), k Nearest Neighbor (kNN), and Support Vector Machines (SVM). The experimental results show that IG and CHI significantly outperform DF, and that SVM and NB are more effective than kNN when the macro-averaged F1 measure is used. DF is suitable for large text classification tasks.
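
Two of the scoring criteria named in this abstract, document frequency and the χ² statistic, can both be computed from a 2×2 table of term presence against class membership. The sketch below uses the common text-categorization form of χ², N(AD − CB)² / ((A+C)(B+D)(A+B)(C+D)). The abstract does not give the exact variant the paper used, so that formula is an assumption, and the function name `term_scores` and the toy documents are hypothetical.

```python
def term_scores(docs, labels, term, target):
    """Return (document frequency, chi-square) of `term` with respect to class `target`.

    `docs` is a list of token sets and `labels` the matching class labels.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, document in target class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in target class
        else:
            d += 1  # term absent, document in another class

    n = a + b + c + d
    df = a + b
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    chi2 = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return df, chi2


# Toy corpus (made up): "lipid" tracks the class, "diet" does not.
docs = [{"lipid", "high"}, {"lipid", "diet"}, {"sport", "diet"}, {"sport", "high"}]
labels = ["med", "med", "other", "other"]
print(term_scores(docs, labels, "lipid", "med"))
print(term_scores(docs, labels, "diet", "med"))
```

In this toy corpus both terms appear in two documents, so DF cannot tell them apart, while χ² separates the class-dependent term from the class-independent one.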

