BagMeLiF: stable boosting-based hybrid-ensemble feature selection algorithm for high-dimensional data



Optimal Feature Selection Algorithm for High Dimensional Data Sets Using Particle Swarm Optimization

International Journal of Latest Trends in Engineering and Technology ◽ 10.21172/1.82.028 ◽ 2017 ◽ Vol 8 (2)

Keyword(s): Feature Selection ◽ Particle Swarm Optimization ◽ High Dimensional Data ◽ High Dimensional ◽ Data Sets ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Swarm Optimization ◽ Optimal Feature Selection ◽ Optimal Feature


Efficient Feature Subset Selection Algorithm for High Dimensional Data

International Journal of Electrical and Computer Engineering (IJECE) ◽ 10.11591/ijece.v6i4.9800 ◽ 2016 ◽ Vol 6 (4) ◽ pp. 1880 ◽ Cited By ~ 1

Author(s): Smita Chormunge ◽ Sudarson Jena

Keyword(s): Feature Selection ◽ Information Gain ◽ High Dimensional Data ◽ Feature Subset Selection ◽ High Dimensional ◽ Feature Subset ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Computational Performance ◽ Optimal Feature Subset

Feature selection addresses the dimensionality problem by removing irrelevant and redundant features. Existing feature selection algorithms take a long time to obtain a feature subset from high-dimensional data. This paper proposes a feature selection algorithm for high-dimensional data based on the information gain measure, termed IFSA (Information gain based Feature Selection Algorithm), which produces an optimal feature subset in efficient time and improves the computational performance of learning algorithms. IFSA works in two steps: first, a filter is applied to the data set; second, a small feature subset is produced using the information gain measure. Extensive experiments compare the proposed algorithm with other methods using two classifiers (Naive Bayes and IBK) on microarray and text data sets. The results demonstrate that IFSA not only produces the most selective feature subset in efficient time but also improves classifier performance.
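The two-step idea in this abstract can be illustrated with a minimal sketch. It is not the authors' IFSA implementation: the abstract does not say which filter is used in the first step, so dropping constant features is an assumption made here for illustration, and the names `entropy`, `information_gain`, and `select_features` are hypothetical. Features are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    total = len(labels)
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(rows, labels, k):
    """Hypothetical two-step selection: filter, then keep the top-k features by IG."""
    n_features = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_features)]
    # Step 1 (assumed filter, not taken from the paper): drop constant features.
    candidates = [j for j in range(n_features) if len(set(columns[j])) > 1]
    # Step 2: rank the remaining features by information gain, highest first.
    ranked = sorted(
        candidates,
        key=lambda j: information_gain(columns[j], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy data: feature 0 determines the label, feature 1 is constant,
# feature 2 is independent of the label.
rows = [(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)]
labels = [0, 0, 1, 1]
selected = select_features(rows, labels, k=1)
```

On this toy data the constant feature is removed by the filter and the label-determining feature is ranked first. Continuous features, such as microarray expression levels, would need to be discretized before this measure applies.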


Research of Medical High-Dimensional Imbalanced Data Classification Ensemble Feature Selection Algorithm with Random Forest

2017 International Conference on Smart Grid and Electrical Automation (ICSGEA) ◽ 10.1109/icsgea.2017.158 ◽ 2017 ◽ Cited By ~ 2

Author(s): Min Zhu ◽ Bo Su ◽ Gangmin Ning

Keyword(s): Feature Selection ◽ Random Forest ◽ Imbalanced Data ◽ Data Classification ◽ High Dimensional ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Imbalanced Data Classification


A hybrid feature selection algorithm combining ReliefF and Particle swarm optimization for high-dimensional medical data

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-202948 ◽ 2021 ◽ pp. 1-15

Author(s): Zhaozhao Xu ◽ Derong Shen ◽ Yue Kou ◽ Tiezheng Nie

Keyword(s): Feature Selection ◽ Particle Swarm Optimization ◽ Random Forest ◽ Classification Accuracy ◽ Particle Swarm ◽ Medical Data ◽ High Dimensional ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Swarm Optimization

Because medical data are high-dimensional and their features are strongly correlated, classification accuracy on such data is often lower than expected. Feature selection is a common way to address this problem: it selects effective features and thereby reduces the dimensionality of the data. However, traditional feature selection algorithms set their thresholds blindly, and their search procedures are liable to fall into local optima. To address this, the paper proposes a hybrid feature selection algorithm combining ReliefF and Particle Swarm Optimization. The algorithm has three main parts. First, ReliefF is used to calculate the feature weights, and the features are ranked by weight. Then the ranked features are grouped by density equalization, so that the density of features in each group is the same. Finally, the Particle Swarm Optimization algorithm searches the ranked feature groups, and feature selection is performed according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features. More importantly, the selected subset contains the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which indicates that the hybrid algorithm has practical application value.
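The first stage described in this abstract, weighting and ranking features, can be sketched as follows. This is not the paper's code: for brevity it uses basic Relief (two classes, one nearest hit and one nearest miss per instance) rather than ReliefF, which averages over several neighbors and handles multiple classes. The function name `relief_weights` and the toy data are hypothetical, features are assumed to be numeric and already scaled to [0, 1], and each class is assumed to have at least two instances. The density-equalization grouping and the PSO search are not shown, since the abstract does not give enough detail to reproduce them.

```python
def relief_weights(rows, labels):
    """Basic Relief feature weights, visiting every instance once."""
    n_features = len(rows[0])
    weights = [0.0] * n_features

    def distance(a, b):
        # Manhattan distance between two instances.
        return sum(abs(x - y) for x, y in zip(a, b))

    for i, (row, label) in enumerate(zip(rows, labels)):
        others = [j for j in range(len(rows)) if j != i]
        hits = [j for j in others if labels[j] == label]
        misses = [j for j in others if labels[j] != label]
        nearest_hit = rows[min(hits, key=lambda j: distance(row, rows[j]))]
        nearest_miss = rows[min(misses, key=lambda j: distance(row, rows[j]))]
        for f in range(n_features):
            # A feature gains weight when it differs on the nearest miss
            # and loses weight when it differs on the nearest hit.
            diff_miss = abs(row[f] - nearest_miss[f])
            diff_hit = abs(row[f] - nearest_hit[f])
            weights[f] += (diff_miss - diff_hit) / len(rows)
    return weights


# Toy data: feature 0 separates the two classes, feature 1 does not.
rows = [(0.0, 0.2), (0.1, 0.9), (0.9, 0.1), (1.0, 0.8)]
labels = [0, 0, 1, 1]
weights = relief_weights(rows, labels)
# Rank features by weight, highest first.
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
```

On this toy data the class-separating feature receives a positive weight and the uninformative feature a negative one, so the ranking places feature 0 first. In the hybrid scheme described above, such a ranking would then be split into groups and passed to the swarm search.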
