BoostFS: A Boosting-Based Irrelevant Feature Selection Algorithm

Author(s):  
Qi-Guang Miao ◽  
Ying Cao ◽  
Jian-Feng Song ◽  
Jiachen Liu ◽  
Yining Quan

Features play a fundamental role in any learning process. In this paper, we propose a Boosting-based feature selection algorithm called BoostFS. It extends AdaBoost, which is designed for classification problems, to feature selection. BoostFS maintains a distribution over the training samples, initialized to the uniform distribution. In each iteration, a decision stump is trained under the current sample distribution, and the distribution is then adjusted so that it is orthogonal to the classification results of all the stumps generated so far. Because a decision stump can also be regarded as one selected feature, BoostFS can select a subset of features that are as irrelevant to one another as possible. Experimental results on synthetic datasets, five UCI datasets, and a real malware detection dataset all show that the features selected by BoostFS help to improve learning algorithms in classification problems, especially when the original feature set contains redundant features.
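The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes the standard AdaBoost reweighting step, which makes the new distribution orthogonal only to the most recent stump rather than to all generated stumps, and it assumes each feature may be selected at most once. The names `best_stump` and `boost_select` and the toy data are hypothetical.

```python
import math

def best_stump(X, y, w, excluded):
    """Return (error, feature, threshold, polarity) of the stump with the
    lowest weighted error, skipping features in `excluded`."""
    best = None
    for j in range(len(X[0])):
        if j in excluded:
            continue
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                # The stump predicts `polarity` when the feature exceeds t.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (polarity if row[j] > t else -polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def boost_select(X, y, k):
    """Select up to k feature indices; labels in y must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n                      # start from the uniform distribution
    selected = []
    for _ in range(k):
        found = best_stump(X, y, w, set(selected))
        if found is None:                  # no unused features left
            break
        err, j, t, pol = found
        selected.append(j)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        preds = [pol if row[j] > t else -pol for row in X]
        # Standard AdaBoost update (an assumption, see lead-in): afterwards
        # the chosen stump has weighted error 1/2 under the new distribution.
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return selected

# Toy data: feature 1 duplicates feature 0; feature 2 errs on different samples.
y = [1, 1, 1, 1, -1, -1, -1, -1]
f0 = [1, 1, 1, 0, 0, 0, 0, 1]
f1 = list(f0)
f2 = [0, 1, 1, 1, 0, 0, 1, 0]
X = [list(row) for row in zip(f0, f1, f2)]
selected = boost_select(X, y, 2)
```

In this toy setup the duplicated feature carries no new information after reweighting, so the second pick goes to the complementary feature instead. That is the redundancy-avoiding behaviour the abstract attributes to BoostFS, although here it is reproduced only under the simplifications named above.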

2010 ◽  
Vol 44-47 ◽  
pp. 3538-3542
Author(s):  
Ai Guo Li ◽  
Xin Zhou ◽  
Jiu Long Zhang

Most inverse classification algorithms handle only discrete attributes and cannot deal with quantitative attributes. To overcome this disadvantage, discretization algorithms are applied to inverse classification. The main idea is as follows: first, a group of feature attributes is selected using a feature selection algorithm; then, the quantitative attributes are discretized using discretization algorithms, and the inverted statistics are constructed on the training samples; finally, the test samples are analyzed. Experimental results on the IRIS and Ecoli datasets show that this method finds the class label effectively and estimates missing values accurately, with results no worse than those of ISGNN and kNN.


Author(s):  
Donald Douglas Atsa'am

A filter feature selection algorithm is developed and its performance is tested. In the initial step, the algorithm dichotomizes the dataset, then separately computes the association between each predictor and the class variable using relative odds (odds ratios). The value of each odds ratio becomes the importance ranking of the corresponding explanatory variable in determining the output. Logistic regression classification is used to test the performance of the new algorithm against three existing feature selection algorithms: the Fisher index, Pearson's correlation, and the varImp function. A number of experimental datasets are employed, and in most cases the subsets selected by the new algorithm produced models with higher classification accuracy than the subsets suggested by the existing feature selection algorithms. Therefore, the proposed algorithm is a reliable alternative in filter feature selection for binary classification problems.
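The odds-ratio ranking step might look roughly like the sketch below. The abstract does not say how the dataset is dichotomized or how empty cells are handled, so the median split and the Haldane-Anscombe correction (adding 0.5 to every cell) are assumptions, as are the function names `odds_ratio` and `rank_features` and the toy data.

```python
from statistics import median

def odds_ratio(x_bin, y_bin):
    """Odds ratio of a 2x2 table built from two binary sequences."""
    a = b = c = d = 0
    for x, y in zip(x_bin, y_bin):
        if x and y:
            a += 1          # predictor high, class positive
        elif x and not y:
            b += 1          # predictor high, class negative
        elif not x and y:
            c += 1          # predictor low, class positive
        else:
            d += 1          # predictor low, class negative
    # Assumption: add 0.5 to each cell so empty cells do not divide by zero.
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

def rank_features(X, y):
    """Return (feature indices sorted by descending odds ratio, scores)."""
    scores = {}
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        cut = median(col)                  # assumption: dichotomize at the median
        scores[j] = odds_ratio([v > cut for v in col], y)
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores

# Toy data: feature 0 separates the classes, feature 1 does not.
y = [1, 1, 1, 0, 0, 0]
X = [[5, 1], [6, 5], [7, 2], [1, 6], [2, 3], [3, 4]]
ranking, scores = rank_features(X, y)
```

The sketch ranks by the raw odds ratio, as the abstract describes. A predictor with an odds ratio well below 1 is also strongly associated with the class, only in the opposite direction, so a practical variant might rank by the larger of the ratio and its reciprocal.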


Energies ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1238
Author(s):  
Supanat Chamchuen ◽  
Apirat Siritaratiwat ◽  
Pradit Fuangfoo ◽  
Puripong Suthisopapan ◽  
Pirat Khunkitti

Power quality disturbance (PQD) is an important issue in electrical distribution systems that needs to be detected promptly and identified to prevent the degradation of system reliability. This work proposes a PQD classification system that uses a novel algorithm called “adaptive ABC-PSO”, which combines the artificial bee colony (ABC) and particle swarm optimization (PSO) algorithms, as the feature selection algorithm. The proposed adaptive technique is applied to the combination of the ABC and PSO algorithms. A discrete wavelet transform is used as the feature extraction method, and a probabilistic neural network is used as the classifier. We found that the highest classification accuracy (99.31%) could be achieved with nine optimally selected features out of all 72 extracted features. Moreover, the proposed PQD classification system demonstrated high performance in a noisy environment, as well as in a real distribution system. Compared with previous studies, the PQD classification accuracy obtained using adaptive ABC-PSO as the feature selection algorithm is in the high range; therefore, the adaptive ABC-PSO algorithm can be used to classify PQDs in a practical electrical distribution system.

