Hybrid approaches to feature subset selection for data classification in high-dimensional feature space

2020 ◽  
Vol 9 (1) ◽  
pp. 45
Author(s):  
Maysa Ibrahem Almulla Khalaf ◽  
John Q Gan

This paper proposes two hybrid feature subset selection approaches based on the combination (union or intersection) of both supervised and unsupervised filter approaches before using a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability and low time consumption. Experiments with the proposed hybrid approaches have been conducted on seven high-dimensional feature datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). Experimental results have demonstrated the advantages and usefulness of the proposed methods in feature subset selection in high-dimensional space in terms of the number of selected features and time spent to achieve the best classification accuracy.
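The filter-then-wrapper idea described above can be illustrated with a minimal sketch. The abstract does not name the specific filters or the wrapper search strategy, so everything concrete here is an assumption made for illustration: a Fisher-style score stands in for the supervised filter, per-feature variance for the unsupervised filter, and a greedy forward search scored by leave-one-out 1-nearest-neighbour accuracy for the wrapper. All function names and the toy data are hypothetical.

```python
from statistics import mean, pvariance

def variance_scores(X):
    """Unsupervised filter (assumed): score each feature by its variance."""
    return [pvariance(col) for col in zip(*X)]

def fisher_scores(X, y):
    """Supervised filter (assumed): between-class over within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for col in zip(*X):
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / (within + 1e-12))  # small constant avoids division by zero
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])

def combine_filters(X, y, k, mode="union"):
    """Combine the two filter outputs by union or intersection."""
    sup = top_k(fisher_scores(X, y), k)
    unsup = top_k(variance_scores(X), k)
    return sorted(sup | unsup if mode == "union" else sup & unsup)

def loo_1nn_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `feats`."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in feats))
        correct += y[nearest] == y[i]
    return correct / len(X)

def forward_wrapper(X, y, candidates):
    """Greedy forward selection over the filtered candidate features."""
    selected, best_acc = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in candidates:
            if f in selected:
                continue
            acc = loo_1nn_accuracy(X, y, selected + [f])
            if acc > best_acc:
                best_acc, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return selected, best_acc

# Toy data: feature 0 separates the classes, feature 1 is high-variance noise,
# feature 2 is nearly constant.
X = [[0.0, 5.0, 1.0], [0.1, -4.0, 1.0], [0.2, 6.0, 1.1],
     [1.0, -5.0, 1.0], [1.1, 4.0, 1.0], [1.2, -6.0, 1.1]]
y = [0, 0, 0, 1, 1, 1]

candidates = combine_filters(X, y, k=1, mode="union")
selected, accuracy = forward_wrapper(X, y, candidates)
print(candidates, selected, accuracy)
```

The union keeps any feature that either filter ranks highly, which is more permissive. The intersection keeps only features both filters agree on, which hands the wrapper a smaller candidate set and so reduces its search time.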

Author(s):  
K. C. Sharmili ◽  
A. Chilambuchelvan

Feature subset selection plays an essential part in data mining and machine learning. A good feature subset selection algorithm can effectively remove irrelevant and redundant features while taking feature interaction into account. This not only leads to a better understanding of the data but also improves the performance of a learner by enhancing the generalization capacity and interpretability of the learning model. In the proposed technique, the input microarray dataset is first selected from a medical database and preprocessed. The preprocessed output is fed to the second step, in which features are optimally selected using a clustering and optimization process: a hybrid fuzzy c-means clustering algorithm combined with the artificial bee colony algorithm is applied to the high-dimensional microarray dataset to select the important features. The proposed method also selects features optimally with the help of the social spider optimization algorithm. Classification is then performed by an improved support vector machine classifier. Finally, experiments are carried out on different microarray datasets. Experimental results indicate that the proposed classification framework achieves a higher accuracy of 93.19% on the GLA-BRA-180 dataset, compared with 90.69% and 89% for the existing SVM and neuro-fuzzy classifiers, respectively.


2016 ◽  
Vol 2016 ◽  
pp. 1-6 ◽  
Author(s):  
Gürcan Yavuz ◽  
Doğan Aydin

Optimal feature subset selection is an important and difficult task for pattern classification, data mining, and machine intelligence applications. The objective of feature subset selection is to eliminate irrelevant and noisy features in order to select optimum feature subsets and increase accuracy. A large number of features in a dataset increases the computational complexity, leading to performance degradation. In this paper, to overcome this problem, the angle modulation technique is used to reduce the feature subset selection problem to a four-dimensional continuous optimization problem instead of representing it as a high-dimensional bit vector. To demonstrate the effectiveness of this problem representation and to determine the efficiency of the proposed method, six variants of the Artificial Bee Colony (ABC) algorithm employ angle modulation for feature selection. Experimental results on six high-dimensional datasets show that the Angle Modulated ABC algorithms improved the classification accuracy with smaller feature subsets.
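How four continuous parameters can stand in for a long bit vector is easiest to see in code. The abstract does not give the generating function, so the sketch below assumes the form commonly used in the angle modulation literature, g(x) = sin(2π(x - a) · b · cos(2π(x - a) · c)) + d, sampled at x = 0, 1, ..., n - 1, with feature x selected when g(x) > 0. The function names are hypothetical. The ABC search over (a, b, c, d) is not shown.

```python
import math

def angle_modulated_bits(a, b, c, d, n_bits):
    """Map four real parameters to a bit vector of length n_bits.

    Assumed generating function:
        g(x) = sin(2*pi*(x - a) * b * cos(2*pi*(x - a) * c)) + d
    Bit x is 1 when g(x) > 0, else 0.
    """
    bits = []
    for x in range(n_bits):
        shifted = 2 * math.pi * (x - a)
        g = math.sin(shifted * b * math.cos(shifted * c)) + d
        bits.append(1 if g > 0 else 0)
    return bits

def selected_features(a, b, c, d, n_features):
    """Indices of the features whose bit is set."""
    bits = angle_modulated_bits(a, b, c, d, n_features)
    return [j for j, bit in enumerate(bits) if bit == 1]

# An optimizer would search over (a, b, c, d) only, regardless of n_features.
mask = angle_modulated_bits(a=0.3, b=0.7, c=0.05, d=0.0, n_bits=20)
print(mask)
```

Under this assumption, the vertical shift d biases how many features are selected: because the sine term lies in [-1, 1], any d greater than 1 selects every feature and any d less than -1 selects none.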

