A differential evolution based feature combination selection algorithm for high-dimensional data

2021 ◽  
Vol 547 ◽  
pp. 870-886
Author(s):  
Boxin Guan ◽  
Yuhai Zhao ◽  
Ying Yin ◽  
Yuan Li
Author(s):  
Parul Agarwal ◽  
Shikha Mehta

Subspace clustering approaches cluster high dimensional data in different subspaces. It means grouping the data with different relevant subsets of dimensions. This technique has become very effective as a distance measure becomes ineffective in a high dimensional space. This chapter presents a novel evolutionary approach to a bottom up subspace clustering SUBSPACE_DE which is scalable to high dimensional data. SUBSPACE_DE uses a self-adaptive DBSCAN algorithm to perform clustering in data instances of each attribute and maximal subspaces. Self-adaptive DBSCAN clustering algorithms accept input from differential evolution algorithms. The proposed SUBSPACE_DE algorithm is tested on 14 datasets, both real and synthetic. It is compared with 11 existing subspace clustering algorithms. Evaluation metrics such as F1_Measure and accuracy are used. Performance analysis of the proposed algorithms is considerably better on a success rate ratio ranking in both accuracy and F1_Measure. SUBSPACE_DE also has potential scalability on high dimensional datasets.


2020 ◽  
Vol 13 (2) ◽  
pp. 223-238
Author(s):  
Abhishek Dixit ◽  
Ashish Mani ◽  
Rohit Bansal

PurposeFeature selection is an important step for data pre-processing specially in the case of high dimensional data set. Performance of the data model is reduced if the model is trained with high dimensional data set, and it results in poor classification accuracy. Therefore, before training the model an important step to apply is the feature selection on the dataset to improve the performance and classification accuracy.Design/methodology/approachA novel optimization approach that hybridizes binary particle swarm optimization (BPSO) and differential evolution (DE) for fine tuning of SVM classifier is presented. The name of the implemented classifier is given as DEPSOSVM.FindingsThis approach is evaluated using 20 UCI benchmark text data classification data set. Further, the performance of the proposed technique is also evaluated on UCI benchmark image data set of cancer images. From the results, it can be observed that the proposed DEPSOSVM techniques have significant improvement in performance over other algorithms in the literature for feature selection. The proposed technique shows better classification accuracy as well.Originality/valueThe proposed approach is different from the previous work, as in all the previous work DE/(rand/1) mutation strategy is used whereas in this study DE/(rand/2) is used and the mutation strategy with BPSO is updated. Another difference is on the crossover approach in our case as we have used a novel approach of comparing best particle with sigmoid function. The core contribution of this paper is to hybridize DE with BPSO combined with SVM classifier (DEPSOSVM) to handle the feature selection problems.


Author(s):  
Smita Chormunge ◽  
Sudarson Jena

<p>Feature selection approach solves the dimensionality problem by removing irrelevant and redundant features. Existing Feature selection algorithms take more time to obtain feature subset for high dimensional data. This paper proposes a feature selection algorithm based on Information gain measures for high dimensional data termed as IFSA (Information gain based Feature Selection Algorithm) to produce optimal feature subset in efficient time and improve the computational performance of learning algorithms. IFSA algorithm works in two folds: First apply filter on dataset. Second produce the small feature subset by using information gain measure. Extensive experiments are carried out to compare proposed algorithm and other methods with respect to two different classifiers (Naive bayes and IBK) on microarray and text data sets. The results demonstrate that IFSA not only produces the most select feature subset in efficient time but also improves the classifier performance.</p>


Sign in / Sign up

Export Citation Format

Share Document