Efficient Feature Subset Selection Algorithm for High Dimensional Data

Author(s):  
Smita Chormunge ◽  
Sudarson Jena

Feature selection addresses the dimensionality problem by removing irrelevant and redundant features. Existing feature selection algorithms take considerable time to obtain a feature subset for high-dimensional data. This paper proposes a feature selection algorithm based on information gain measures for high-dimensional data, termed IFSA (Information gain based Feature Selection Algorithm), to produce an optimal feature subset in efficient time and improve the computational performance of learning algorithms. The IFSA algorithm works in two phases: first, a filter is applied to the dataset; second, a small feature subset is produced using the information gain measure. Extensive experiments are carried out to compare the proposed algorithm with other methods using two different classifiers (Naive Bayes and IBk) on microarray and text datasets. The results demonstrate that IFSA not only produces a compact feature subset in efficient time but also improves classifier performance.
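A minimal sketch of the general filter idea behind this kind of method: score every feature by its information gain (mutual information) with the class label and keep the highest-scoring subset. The dataset, threshold, and use of scikit-learn are illustrative assumptions, not the authors' IFSA implementation.

```python
# Rank features by information gain with the class label and keep the
# top-scoring subset; threshold and data are placeholders for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                           random_state=0)

# Information gain (mutual information) of each feature with the class label.
scores = mutual_info_classif(X, y, random_state=0)

# Keep features whose score exceeds a chosen threshold (here: the mean score).
selected = np.where(scores > scores.mean())[0]
X_reduced = X[:, selected]
print(f"kept {len(selected)} of {X.shape[1]} features")
```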


Author(s):  
Hui Wang ◽  
Li Li Guo ◽  
Yun Lin

Automatic modulation recognition is very important for receiver design in broadband multimedia communication systems, and reasonable signal feature extraction and selection algorithms are the key technology of digital multimedia signal recognition. In this paper, information entropy is used to extract single features, namely power spectrum entropy, wavelet energy spectrum entropy, singular spectrum entropy, and Renyi entropy. Then, a distance-measure feature selection algorithm and Sequential Feature Selection (SFS) are presented to select the optimal feature subset. Finally, a BP neural network is used to classify the signal modulation. The simulation results show that the four different information entropies can be used to classify different signal modulations, and that the feature selection algorithm successfully chooses the optimal feature subset and achieves the best performance.
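As an illustration of one of the entropy features named above, the sketch below computes the power spectrum entropy of a signal as the Shannon entropy of its normalized power spectrum. The toy signal and parameter choices are assumptions for demonstration only.

```python
# Power spectrum entropy: Shannon entropy of the normalized power spectrum.
import numpy as np

def power_spectrum_entropy(signal):
    spectrum = np.abs(np.fft.fft(signal)) ** 2      # power spectrum
    p = spectrum / spectrum.sum()                   # normalize to a distribution
    p = p[p > 0]                                    # avoid log(0)
    return -np.sum(p * np.log2(p))                  # Shannon entropy in bits

t = np.linspace(0, 1, 1024, endpoint=False)
toy_signal = np.cos(2 * np.pi * 50 * t + np.pi * (t > 0.5))  # toy phase-modulated signal
print(power_spectrum_entropy(toy_signal))
```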


2013 ◽  
Vol 774-776 ◽  
pp. 1532-1537
Author(s):  
Jing Wei Yang ◽  
Si Le Wang ◽  
Ying Yi Chen ◽  
Su Kui Lu ◽  
Wen Zhu Yang

This paper presents a genetic-based feature selection algorithm for object recognition. First, the proposed algorithm encodes a solution as a binary chromosome. Second, the initial population is generated randomly. Third, a crossover operator and a mutation operator are applied to these chromosomes to generate more competent chromosomes. The crossover and mutation probabilities are adjusted dynamically according to the generation number and the fitness value. The proposed algorithm is tested using features extracted from cotton foreign fiber objects. The results indicate that the proposed algorithm can obtain the optimal feature subset and can reduce the classification time while keeping the classification accuracy constant.
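A compact sketch of such a genetic feature-selection loop with binary chromosomes is shown below. The fitness function, the adaptive rate schedule, the truncation selection, and the dataset are illustrative assumptions, not the authors' exact operators or parameters.

```python
# Genetic feature selection: each chromosome is a 0/1 mask over the features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=150, n_features=40, n_informative=8, random_state=0)
n_features, pop_size, n_generations = X.shape[1], 20, 15

def fitness(chrom):
    if chrom.sum() == 0:
        return 0.0
    # Cross-validated accuracy on the selected columns serves as fitness.
    return cross_val_score(KNeighborsClassifier(), X[:, chrom == 1], y, cv=3).mean()

pop = rng.integers(0, 2, size=(pop_size, n_features))        # random initial population
for gen in range(n_generations):
    scores = np.array([fitness(c) for c in pop])
    # Adaptive rates: more exploration early, less late (illustrative schedule).
    p_cross = 0.9 - 0.4 * gen / n_generations
    p_mut = 0.10 - 0.08 * gen / n_generations
    order = np.argsort(scores)[::-1]
    parents = pop[order[: pop_size // 2]]                     # truncation selection
    children = []
    while len(children) < pop_size - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        child = a.copy()
        if rng.random() < p_cross:                            # one-point crossover
            cut = rng.integers(1, n_features)
            child[cut:] = b[cut:]
        flip = rng.random(n_features) < p_mut                 # bit-flip mutation
        child[flip] ^= 1
        children.append(child)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(c) for c in pop])]
print("selected features:", np.flatnonzero(best))
```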


2013 ◽  
Vol 347-350 ◽  
pp. 2344-2348
Author(s):  
Lin Cheng Jiang ◽  
Wen Tang Tan ◽  
Zhen Wen Wang ◽  
Feng Jing Yin ◽  
Bin Ge ◽  
...  

Feature selection has become a focus of research in applications with high-dimensional data. Nonnegative matrix factorization (NMF) is a good method for dimensionality reduction, but it cannot select the optimal feature subset because it is a feature extraction method. In this paper, a two-step strategy based on improved NMF is proposed. The first step obtains the basis of each category in the dataset by NMF; added constraints guarantee that these bases are sparse and largely distinct from each other, which contributes to classification. An auxiliary function is used to prove that the algorithm converges. In the second step, the classic ReliefF algorithm weights each feature using all the basis vectors and chooses the optimal feature subset. The experimental results reveal that the proposed method selects a representative and relevant feature subset that is effective in improving the performance of the classifier.
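A much-simplified sketch of the two-step idea: (1) learn a sparse NMF basis per class, (2) weight each original feature using the stacked basis vectors and keep the top-weighted subset. It uses scikit-learn's NMF with an L1 penalty as a stand-in for the paper's constrained factorization, and a simple basis-derived weight in place of ReliefF; all parameters and data are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import NMF

X, y = make_classification(n_samples=120, n_features=60, n_informative=10, random_state=0)
X = X - X.min()                      # NMF requires nonnegative input

bases = []
for label in np.unique(y):
    # L1-penalized NMF on the samples of one class (stand-in for the
    # paper's sparsity/distinctness constraints).
    model = NMF(n_components=3, init="nndsvda", l1_ratio=1.0, alpha_W=0.1,
                max_iter=500, random_state=0)
    model.fit(X[y == label])
    bases.append(model.components_)  # basis vectors for this class

H = np.vstack(bases)                 # all basis vectors, shape (classes * components, features)

# Stand-in for ReliefF: weight a feature by how unevenly it loads across the
# class-specific bases (discriminative features load unevenly).
weights = H.max(axis=0) - H.min(axis=0)
selected = np.argsort(weights)[::-1][:15]
print("selected feature indices:", np.sort(selected))
```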


2013 ◽  
Vol 347-350 ◽  
pp. 2712-2716
Author(s):  
Lin Tao Lü ◽  
Peng Li ◽  
Yu Xiang Yang ◽  
Fang Tan

Based on the characteristics of palm bio-impedance spectroscopy (BIS) data, this paper proposes an effective feature model for palm BIS data: an elliptical model. The model combines an immune clone algorithm with the least squares method to establish a palm BIS feature selection algorithm, uses this algorithm to obtain the optimal feature subset that can fully represent the palm BIS data, and then applies several classification algorithms for comparison. The experimental results show that the accuracy of the feature subset obtained by the algorithm reaches 93.2% in an SVM classification test, verifying that the algorithm is a valid and reliable palm BIS feature selection algorithm.
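A minimal sketch of one ingredient mentioned above: fitting an ellipse-like conic to impedance-spectroscopy points (real vs. imaginary part) by linear least squares, so the fitted coefficients could serve as compact features for a downstream classifier. The synthetic data and the fitting formulation are assumptions for illustration, not the authors' immune-clone pipeline.

```python
import numpy as np

def fit_ellipse(x, y):
    # Solve a*x^2 + b*x*y + c*y^2 + d*x + e*y = 1 in the least-squares sense.
    A = np.column_stack([x * x, x * y, y * y, x, y])
    coeffs, *_ = np.linalg.lstsq(A, np.ones_like(x), rcond=None)
    return coeffs  # (a, b, c, d, e): candidate feature vector for a classifier

# Synthetic arc resembling an impedance-plane (Cole-type) plot.
theta = np.linspace(0.1, np.pi - 0.1, 50)
x = 100 + 80 * np.cos(theta) + np.random.default_rng(0).normal(0, 1, 50)
y = 40 * np.sin(theta)
print(fit_ellipse(x, y))
```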

