A Population-Based Incremental Learning approach to microarray gene expression feature selection

Author(s):  
Meir Perez ◽  
David M Rubin ◽  
Tshilidzi Marwala ◽  
Lesley E Scott ◽  
Wendy Stevens

In microarray gene expression research, the high dimensionality of the features combined with a comparatively small sample size makes a robust and efficient feature selection method necessary for classifying gene expression data more accurately. In this paper we propose a hybrid feature selection approach (mRMRAGA), which combines minimum redundancy maximum relevance (mRMR) with an adaptive genetic algorithm (AGA). The mRMR method is frequently used to identify the characteristics of genes and their phenotypes more accurately: it narrows the candidates down to features that are relevant to the phenotype and minimally redundant with one another, hence the name. The genetic algorithm (GA) is a heuristic search method inspired by natural selection, and the adaptive genetic algorithm is an improved variant that gives better performance. We evaluated the proposed approach on four benchmark datasets and classified the selected features with four well-known classifiers. The measured accuracy was better than that of other conventional feature selection methods.
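The mRMR step described above can be illustrated with a minimal sketch. It assumes expression values have already been discretized, uses the common difference form of the criterion (relevance minus mean redundancy), and measures both with mutual information. The function names `mutual_information` and `mrmr_select` are hypothetical and are not taken from the paper; the genetic algorithm stage is not shown.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Build the joint distribution from co-occurrence counts.
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx.ravel(), y_idx.ravel()), 1)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, ny)
    nonzero = joint > 0
    ratio = joint[nonzero] / (px @ py)[nonzero]
    return float(np.sum(joint[nonzero] * np.log(ratio)))

def mrmr_select(X, y, k):
    """Greedily pick k column indices of discrete X (assumed criterion:
    relevance to y minus mean redundancy with already selected columns)."""
    n_features = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean(
                [mutual_information(X[:, j], X[:, s]) for s in selected]
            )
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data: column 1 duplicates column 0, column 2 carries new information.
a = np.tile([0, 0, 1, 1], 2)
b = np.tile([0, 1, 0, 1], 2)
X = np.column_stack([a, a, b])
y = 2 * a + b
chosen = mrmr_select(X, y, 2)
```

On this toy input the duplicated column is skipped in favour of the column that adds new information, which is the behaviour the redundancy term is meant to produce.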


2018 ◽  
Vol 21 (6) ◽  
pp. 420-430 ◽  
Author(s):  
Shuaiqun Wang ◽  
Wei Kong ◽  
Aorigele ◽  
Jin Deng ◽  
Shangce Gao ◽  
...  

Aims and Objective: Redundant information in microarray gene expression data makes cancer classification difficult. Hence, it is important for researchers to find appropriate ways to select informative genes for better identification of cancer. This study presents a hybrid feature selection method, mRMR-ICA, which combines minimum redundancy maximum relevance (mRMR) with the imperialist competition algorithm (ICA) for cancer classification. Materials and Methods: mRMR-ICA uses mRMR as a preprocessing step to delete redundant genes and to provide a smaller dataset to ICA for feature selection. A support vector machine (SVM) evaluates the classification accuracy of the selected genes. The fitness function includes classification accuracy and the number of selected genes. Results: Ten benchmark microarray gene expression datasets are used to test the performance of mRMR-ICA. Both the cancer classification accuracy and the number of informative genes improve with mRMR-ICA compared with the original ICA and other evolutionary algorithms. Conclusion: The comparison results demonstrate that mRMR-ICA can effectively delete redundant genes, so that the algorithm selects fewer informative genes while achieving better classification results. It can also shorten calculation time and improve efficiency.
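A fitness function that includes both classification accuracy and the number of selected genes might look like the sketch below. The abstract does not give the exact form, so the weighted sum, the default `weight` of 0.9, and the `accuracy_fn` callback (standing in for an SVM evaluation of the selected genes) are all assumptions made for illustration.

```python
from typing import Callable, Sequence

def fitness(
    mask: Sequence[int],
    accuracy_fn: Callable[[Sequence[int]], float],
    weight: float = 0.9,
) -> float:
    """Score a gene subset encoded as a 0/1 mask (assumed weighted-sum form).

    accuracy_fn is a hypothetical callback that returns the classification
    accuracy in [0, 1] for the given list of selected gene indices.
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset cannot be classified
    accuracy = accuracy_fn(selected)
    # Fraction of genes left out: higher means a smaller subset.
    sparsity = 1.0 - len(selected) / len(mask)
    return weight * accuracy + (1.0 - weight) * sparsity

# With equal accuracy, the smaller subset scores higher.
small = fitness([1, 0, 0, 0], lambda genes: 1.0)
full = fitness([1, 1, 1, 1], lambda genes: 1.0)
```

Weighting accuracy heavily keeps the search focused on classification quality, while the smaller second term breaks ties in favour of fewer genes.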


Author(s):  
Manoranjan Dash ◽  
Vivekanand Gopalkrishnan

Feature selection and tuple selection help the classifier to focus, achieving similar (or even better) accuracy compared with classification without feature selection and tuple selection. Although feature selection and tuple selection have been studied in various research areas such as machine learning and data mining, they have rarely been studied together. In this chapter, the authors propose a novel distance measure to select the most representative features and tuples. Their experiments are conducted on microarray gene expression datasets as well as UCI machine learning and KDD datasets. Results show that the proposed method significantly outperforms existing methods.

