BoostFS: A Boosting-Based Irrelevant Feature Selection Algorithm

Features play a fundamental role in any learning process. In this paper, we propose a boosting-based feature selection algorithm called BoostFS. It extends AdaBoost, which was designed for classification problems, to feature selection. BoostFS maintains a distribution over the training samples, initialized to the uniform distribution. In each iteration, a decision stump is trained under the current sample distribution, and the distribution is then adjusted so that it is orthogonal to the classification results of all stumps generated so far. Because each decision stump corresponds to one selected feature, BoostFS can select a subset of features that are as irrelevant to one another as possible. Experimental results on synthetic datasets, five UCI datasets, and a real malware detection dataset all show that the features selected by BoostFS improve the performance of learning algorithms on classification problems, especially when the original feature set contains redundant features.
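
The loop described in this abstract can be illustrated with a rough sketch. The abstract does not give the exact reweighting rule, so the code below substitutes the standard AdaBoost update, which only makes the new distribution uncorrelated with the most recent stump (its weighted error becomes 0.5), not with all earlier stumps as BoostFS is said to do. The function names, the exclusion of already-selected features, and the toy data are all assumptions made for illustration.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """Decision stump on one feature: +polarity above the threshold, else -polarity."""
    return polarity if row[feature] > threshold else -polarity

def best_stump(X, y, w, excluded):
    """Return (weighted_error, feature, threshold, polarity) of the best stump,
    skipping features in `excluded`. Returns None if no feature is left."""
    best = None
    for j in range(len(X[0])):
        if j in excluded:
            continue
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def boost_select(X, y, k):
    """Select up to k features; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start from the uniform distribution
    selected = []
    for _ in range(k):
        stump = best_stump(X, y, w, selected)
        if stump is None:
            break
        err, j, thr, pol = stump
        selected.append(j)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        # ASSUMPTION: plain AdaBoost reweighting, standing in for the
        # orthogonality adjustment that the abstract does not spell out.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return selected

# Toy data (hypothetical): feature 1 duplicates feature 0, feature 2 is
# weaker alone but correct on the samples that feature 0 misclassifies.
f0 = [1, 1, 1, 0, 0, 0, 0, 1]
f1 = list(f0)
f2 = [1, 0, 0, 1, 0, 0, 1, 0]
X = list(zip(f0, f1, f2))
y = [1, 1, 1, 1, -1, -1, -1, -1]
print(boost_select(X, y, 2))
```

On this toy data the second round passes over the duplicated feature, because after reweighting its stump is no better than chance, and picks the complementary feature instead. This is the redundancy-avoiding behaviour the abstract attributes to BoostFS.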

A new feature selection algorithm for two-class classification problems and application to endometrial cancer

2012 IEEE 51st IEEE Conference on Decision and Control (CDC) ◽

10.1109/cdc.2012.6426819 ◽

2012 ◽

Cited By ~ 10

Author(s):

M. Eren Ahsen ◽

Nitin K. Singh ◽

Todd Boren ◽

M. Vidyasagar ◽

Michael A. White

Keyword(s):

Feature Selection ◽

Endometrial Cancer ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Classification Problems ◽

New Feature

Inverse Classification Problem of Quantitative Attributes

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.44-47.3538 ◽

2010 ◽

Vol 44-47 ◽

pp. 3538-3542

Author(s):

Ai Guo Li ◽

Xin Zhou ◽

Jiu Long Zhang

Keyword(s):

Feature Selection ◽

Missing Values ◽

Main Idea ◽

Classification Problem ◽

Experimental Results ◽

Classification Algorithms ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Class Label ◽

Training Samples

Most inverse classification algorithms address only discrete attributes and cannot deal with quantitative attributes. To overcome this disadvantage, discretization algorithms are applied to the inverse classification algorithms. The main idea is as follows: first, a group of feature attributes is selected using a feature selection algorithm; then, the quantitative attributes are discretized using discretization algorithms, and the inverted statistics are constructed on the training samples; finally, the test samples are analyzed. Experimental results on the IRIS and Ecoli datasets show that this method can find the class label effectively and estimate the missing values accurately, with results no worse than those of ISGNN and kNN.
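
As a loose illustration of the discretize-then-count pipeline in this abstract, the sketch below makes several assumptions. It uses equal-width binning as the discretization step, reads "inverted statistics" as a table from (attribute, bin) pairs to class-label counts, and predicts a label by summing those counts. None of these choices is confirmed by the abstract, the function names are hypothetical, and the missing-value estimation is not covered.

```python
from collections import Counter

def make_binner(values, n_bins):
    """Build an equal-width binning function from training values (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0      # guard against a constant column
    def to_bin(v):
        idx = int((v - lo) / width)
        return max(0, min(idx, n_bins - 1))  # clamp out-of-range values
    return to_bin

def build_inverted_stats(X, y, n_bins=3):
    """Map each (attribute index, bin) pair to a Counter of class labels."""
    binners = [make_binner([row[j] for row in X], n_bins)
               for j in range(len(X[0]))]
    stats = {}
    for row, label in zip(X, y):
        for j, v in enumerate(row):
            stats.setdefault((j, binners[j](v)), Counter())[label] += 1
    return binners, stats

def predict_label(row, binners, stats):
    """Sum the class counts of every bin the sample falls into; return the top label."""
    votes = Counter()
    for j, v in enumerate(row):
        votes.update(stats.get((j, binners[j](v)), Counter()))
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical toy data with two quantitative attributes.
X_train = [[1.0, 10.0], [1.2, 11.0], [5.0, 20.0], [5.5, 21.0]]
y_train = ["a", "a", "b", "b"]
binners, stats = build_inverted_stats(X_train, y_train, n_bins=2)
print(predict_label([1.1, 10.5], binners, stats))
print(predict_label([5.2, 20.5], binners, stats))
```

The bin edges are learned from the training samples only, so a test sample is mapped with the same cut points that were used to build the counts.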

An improved feature selection algorithm with conditional mutual information for classification problems

2013 International Conference on Human Computer Interactions (ICHCI) ◽

10.1109/ichci-ieee.2013.6887802 ◽

2013 ◽

Cited By ~ 1

Author(s):

Jaganathan Palanichamy ◽

Kuppuchamy Ramasamy

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Conditional Mutual Information ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Classification Problems

Feature Selection Algorithm Using Relative Odds for Data Mining Classification

Big Data Analytics for Sustainable Computing - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-9750-6.ch005 ◽

2020 ◽

pp. 81-106 ◽

Cited By ~ 3

Author(s):

Donald Douglas Atsa'am

Keyword(s):

Feature Selection ◽

Binary Classification ◽

Initial Step ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Classification Problems ◽

Odds Ratios ◽

Relative Odds ◽

Importance Ranking ◽

Selection Algorithms

A filter feature selection algorithm is developed and its performance is tested. In the initial step, the algorithm dichotomizes the dataset and then separately computes the association between each predictor and the class variable using relative odds (odds ratios). The value of each odds ratio becomes the importance ranking of the corresponding explanatory variable in determining the output. Logistic regression classification is deployed to test the performance of the new algorithm in comparison with three existing feature selection algorithms: the Fisher index, Pearson's correlation, and the varImp function. A number of experimental datasets are employed, and in most cases the subsets selected by the new algorithm produced models with higher classification accuracy than the subsets suggested by the existing feature selection algorithms. Therefore, the proposed algorithm is a reliable alternative in filter feature selection for binary classification problems.
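
A minimal sketch of odds-ratio ranking follows, under assumptions that the abstract does not state. Each predictor is dichotomized at its median, 0.5 is added to every cell of the 2x2 table to avoid division by zero (the Haldane-Anscombe correction), and features are ranked by the absolute log odds ratio so that strong negative associations also rank highly. The function names and data are hypothetical.

```python
import math
from statistics import median

def odds_ratio(x_bin, y_bin):
    """Odds ratio of a binary predictor against a binary class,
    with 0.5 added to each cell (assumed zero-cell correction)."""
    a = b = c = d = 0
    for x, y in zip(x_bin, y_bin):
        if x and y:
            a += 1          # predictor high, class positive
        elif x and not y:
            b += 1          # predictor high, class negative
        elif y:
            c += 1          # predictor low, class positive
        else:
            d += 1          # predictor low, class negative
    a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    return (a * d) / (b * c)

def rank_features(X, y):
    """Return feature indices ordered from most to least associated with y."""
    scores = {}
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        cut = median(col)                    # ASSUMPTION: median split
        x_bin = [v > cut for v in col]
        scores[j] = abs(math.log(odds_ratio(x_bin, y)))
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical data: feature 0 separates the classes, feature 1 does not.
X = [[5, 1], [6, 2], [7, 1], [8, 2], [1, 1], [2, 2], [3, 1], [4, 2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
print(rank_features(X, y))
```

An odds ratio of 1 (log of 0) means the dichotomized predictor carries no information about the class, which is why the uninformative feature ends up last.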

An Ensemble-Based Feature Selection Algorithm Using Combination of Support Vector Machine and Filter Methods for Solving Classification Problems

European Journal of Technology and Design ◽

10.13187/ejtd.2013.1.70 ◽

2013 ◽

Vol 1 (1) ◽

pp. 70-76 ◽

Cited By ~ 2

Author(s):

Lev V. Utkin ◽

Yulia A. Zhuk ◽

Anatoly I. Chekh

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Support Vector ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Classification Problems ◽

Filter Methods

Research and implementation of Chinese text feature selection algorithm based on χ² statistics

Computational Intelligence and Industrial Engineering ◽

10.2495/ciie140191 ◽

2014 ◽

Author(s):

Weijiang Wu ◽

Shengkai Wen ◽

Dongmei Xia ◽

Guohe Li

Keyword(s):

Feature Selection ◽

Chinese Text ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Text Feature

BagMeLiF: stable boosting-based hybrid-ensemble feature selection algorithm for high-dimensional data

2020 International Conference on Control, Robotics and Intelligent System ◽

10.1145/3437802.3437835 ◽

2020 ◽

Author(s):

Nikita Pilnenskiy ◽

Ivan Smetannikov

Keyword(s):

Feature Selection ◽

High Dimensional Data ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm

Hybrid Feature Selection Algorithm Based on Discrete Artificial Bee Colony for Parkinson Diagnosis

ACM Transactions on Internet Technology ◽

10.1145/3397161 ◽

2020 ◽

Cited By ~ 1

Author(s):

Haolun Li ◽

Chi-Man Pun ◽

Feng Xu ◽

Longsheng Pan ◽

Rui Zong ◽

...

Keyword(s):

Feature Selection ◽

Artificial Bee Colony ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Bee Colony

High-Accuracy Power Quality Disturbance Classification Using the Adaptive ABC-PSO as Optimal Feature Selection Algorithm

Energies ◽

10.3390/en14051238 ◽

2021 ◽

Vol 14 (5) ◽

pp. 1238

Author(s):

Supanat Chamchuen ◽

Apirat Siritaratiwat ◽

Pradit Fuangfoo ◽

Puripong Suthisopapan ◽

Pirat Khunkitti

Keyword(s):

Feature Selection ◽

Power Quality ◽

Distribution System ◽

Classification Accuracy ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Electrical Distribution ◽

Power Quality Disturbance ◽

Optimal Feature Selection ◽

Optimal Feature

Power quality disturbance (PQD) is an important issue in electrical distribution systems and needs to be detected promptly and identified to prevent the degradation of system reliability. This work proposes a PQD classification system that uses a novel feature selection algorithm, called “adaptive ABC-PSO”, which combines the artificial bee colony (ABC) and particle swarm optimization (PSO) algorithms. The proposed adaptive technique is applied to the combination of the ABC and PSO algorithms. A discrete wavelet transform is used as the feature extraction method, and a probabilistic neural network is used as the classifier. We found that the highest classification accuracy (99.31%) could be achieved with nine optimally selected features out of all 72 extracted features. Moreover, the proposed PQD classification system demonstrated high performance in a noisy environment, as well as in a real distribution system. Compared with previous studies, the PQD classification accuracy obtained using adaptive ABC-PSO as the optimal feature selection algorithm is in the high range; therefore, the adaptive ABC-PSO algorithm can be used to classify PQD in a practical electrical distribution system.
