An Effective Feature Selection Scheme via Genetic Algorithm Using Mutual Information

Author(s): Chunkai K. Zhang, Hong Hu
Author(s): Yuan-Dong Lan

Feature selection aims to choose an optimal subset of features that is necessary and sufficient to improve the generalization performance and running efficiency of a learning algorithm. To obtain this optimal subset, this paper proposes a hybrid feature selection method based on mutual information and a genetic algorithm. To exploit the complementary advantages of the filter and wrapper models, the algorithm is divided into two phases: a filter phase and a wrapper phase. In the filter phase, the algorithm ranks the features by mutual information, providing heuristic information that accelerates the subsequent genetic search. In the wrapper phase, the genetic algorithm serves as the search strategy, using classifier performance and subset dimensionality as the evaluation criteria to find the best feature subset. Experimental results on benchmark datasets show that the proposed algorithm achieves higher classification accuracy with a smaller feature dimension, and runs faster than a genetic algorithm alone.
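The abstract gives no pseudocode, but the two-phase idea can be sketched in plain Python. The sketch below is an illustrative reconstruction, not the paper's exact method: mutual information between each discrete feature and the label drives the filter-phase ranking, that ranking biases the GA's initial population, and the wrapper-phase fitness combines leave-one-out 1-NN accuracy (a stand-in for the paper's unspecified classifier) with a size penalty `lam`. Population size, generations, mutation scheme, and `lam` are all assumed values.

```python
import math, random
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) for two discrete sequences, in nats."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # p(x,y)/(p(x)p(y)) = c*n / (count_x * count_y)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def knn_accuracy(rows, labels, mask):
    """Leave-one-out 1-NN accuracy using only the features where mask is 1."""
    idx = [i for i, b in enumerate(mask) if b]
    if not idx:
        return 0.0
    correct = 0
    for i, r in enumerate(rows):
        best_j, best_d = None, float("inf")
        for j, s in enumerate(rows):
            if i == j:
                continue
            d = sum((r[k] - s[k]) ** 2 for k in idx)
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(rows)

def ga_select(rows, labels, ranking, pop=20, gens=30, lam=0.02, seed=0):
    """Wrapper phase: GA over feature bitmasks.
    Fitness = classifier accuracy - lam * subset size (assumed form).
    The MI ranking from the filter phase biases the initial population
    toward highly ranked features."""
    rng = random.Random(seed)
    m = len(rows[0])
    top = set(ranking[: m // 2])
    def rand_mask():
        return tuple(1 if rng.random() < (0.8 if k in top else 0.2) else 0
                     for k in range(m))
    def fitness(mask):
        return knn_accuracy(rows, labels, mask) - lam * sum(mask)
    popn = [rand_mask() for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=fitness, reverse=True)
        elite = popn[: pop // 2]            # elitist selection
        children = []
        while len(children) < pop - len(elite):
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, m)       # one-point crossover
            child = list(a[:cut] + b[cut:])
            child[rng.randrange(m)] ^= 1    # point mutation
            children.append(tuple(child))
        popn = elite + children
    return max(popn, key=fitness)

# Toy data: only feature 0 determines the label; the rest are noise.
rng = random.Random(1)
rows = [[rng.randint(0, 1) for _ in range(5)] for _ in range(60)]
labels = [r[0] for r in rows]
mi = [mutual_information([r[k] for r in rows], labels) for k in range(5)]
ranking = sorted(range(5), key=lambda k: -mi[k])
best = ga_select(rows, labels, ranking)
```

On this toy data the filter phase ranks feature 0 first, and the GA converges on a subset containing it; the size penalty discourages dragging noise features along.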


2005, Vol. 63, pp. 325-343
Author(s): D. Huang, Tommy W.S. Chow

Procedia CIRP, 2016, Vol. 56, pp. 316-320
Author(s): Lei Lu, Jihong Yan, Yue Meng

2018, Vol. 2018, pp. 1-21
Author(s): Sana Ullah Jan, Insoo Koo

The efficiency of a binary support vector machine (SVM) based classifier depends on the combination and the number of input features extracted from raw signals. Sometimes a combination of individually good features discriminates a class poorly because those features are also highly relevant to a second class. Moreover, increasing the dimensionality of the input vector degrades classifier performance in most cases. For efficient results, a classifier should be fed the smallest possible combination of discriminating features. In this paper, we propose a framework that improves the performance of an SVM-based classifier for sensor fault classification in two ways: first, by selecting the best combination of features for a target class from a feature pool and, second, by minimizing the dimensionality of the input vectors. To obtain the best combination of features, we propose a novel feature selection algorithm that selects m out of M features having the maximum mutual information (relevance) with the target class and the minimum mutual information with the nontarget classes. This criterion ensures that only features sensitive to the target class are selected. Furthermore, to achieve our second objective of reducing input dimensionality, we propose a diversified-input SVM (DI-SVM) model for multiclass classification. In this model, the number of SVM-based classifiers equals the number of classes in the dataset, but each classifier is fed a unique combination of features selected by the feature selection scheme for its target class. The efficiency of the proposed feature selection algorithm is demonstrated by comparing results obtained with and without feature selection.
Furthermore, experimental results in terms of accuracy, receiver operating characteristic (ROC) curves, and the area under the ROC curve (AUC-ROC) show that the proposed DI-SVM model outperforms the conventional SVM model, the neural network, and the k-nearest neighbor algorithm for sensor fault detection and classification.
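The per-class selection criterion above can be sketched concretely. The abstract does not state how relevance and nontarget relevance are combined, so the score below (MI with the target-vs-rest indicator minus the mean MI with each nontarget-vs-rest indicator) is an assumed, illustrative combination; the function and variable names are mine, and SVM training is omitted to keep the sketch stdlib-only.

```python
import math, random
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) for two discrete sequences, in nats."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def select_for_class(features, labels, target, m):
    """Pick m of M features for one class: score each feature by its MI with
    the target-vs-rest indicator minus the mean MI with each nontarget-vs-rest
    indicator (assumed scoring rule), then keep the top m."""
    classes = sorted(set(labels))
    ind = {c: [int(l == c) for l in labels] for c in classes}
    scores = []
    for col in range(len(features[0])):
        xs = [row[col] for row in features]
        relevance = mutual_information(xs, ind[target])
        nontarget = sum(mutual_information(xs, ind[c])
                        for c in classes if c != target)
        scores.append(relevance - nontarget / (len(classes) - 1))
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    return order[:m]

# Toy 3-class data: feature c is the exact indicator of class c,
# features 3 and 4 are noise.
rng = random.Random(7)
y = [i % 3 for i in range(60)]
X = [[int(l == 0), int(l == 1), int(l == 2),
      rng.randint(0, 1), rng.randint(0, 1)] for l in y]
```

In the DI-SVM arrangement described above, each of the per-class classifiers would then be trained on its own `select_for_class` output, so every classifier sees a different low-dimensional input.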

