An Efficient Filter-Based Feature Selection Model to Identify Significant Features from High-Dimensional Microarray Data

2020 ◽  
Vol 45 (4) ◽  
pp. 2619-2630
Author(s):  
D. M. Deepak Raj ◽  
R. Mohanasundaram

Author(s):  
David J. Dittman ◽  
Taghi M. Khoshgoftaar ◽  
Randall Wald ◽  
Jason Van Hulse

2013 ◽  
Vol 9 (16) ◽  
pp. 824-828 ◽  
Author(s):  
Ammu Prasanna Kumar ◽  
Preeja Valsala

2017 ◽  
Vol 9 ◽  
pp. 107-122 ◽  
Author(s):  
Barnali Sahu ◽  
Satchidananda Dehuri ◽  
Alok Kumar Jagadev

2019 ◽  
Vol 76 (8) ◽  
pp. 5745-5762
Author(s):  
Deepak Raj Munirathinam ◽  
Mohanasundaram Ranganadhan

2020 ◽  
Author(s):  
Utkarsh Mahadeo Khaire ◽  
R Dhanalakshmi

Abstract: A microarray dataset contains thousands of DNA spots covering almost every gene in the genome. Microarray-based gene expression profiling helps with the diagnosis, prognosis and treatment of cancer. The nature of diseases changes frequently, which in turn generates a considerable volume of data. The main drawback of microarray data is the curse of dimensionality: it obscures useful information and leads to computational instability. The main objective of feature selection is to identify and remove insignificant and irrelevant features in order to determine the informative genes that cause cancer. Random forest is a well-suited classification algorithm for microarray data. To assess the importance of the variables, we propose using the out-of-bag (OOB) cases in every tree of the forest to count the number of votes for the correct class. Randomly permuting the variables in these OOB cases enables us to select the crucial features from high-dimensional microarray data. In this study, we analyze the effects of various random forest parameters on the selection procedure. The ‘variable drop fraction’ regulates the forest construction. A higher variable drop fraction value decreases the dimensionality of the microarray data more efficiently. A forest built with 800 trees chooses fewer important features under any variable drop fraction value, which reduces microarray data dimensionality.
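
The permutation idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: a hand-written toy rule (`toy_predict`) stands in for a trained forest, a small synthetic table stands in for the out-of-bag cases, and all function names are hypothetical. The reading of "variable drop fraction" as the share of remaining variables discarded per round is also an assumption.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, rng):
    """Accuracy lost when one feature column is shuffled across the rows."""
    baseline = accuracy(predict, rows, labels)
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        permuted = [row[:j] + [value] + row[j + 1:]
                    for row, value in zip(rows, column)]
        scores.append(baseline - accuracy(predict, permuted, labels))
    return scores


def drop_least_important(features, scores, drop_fraction):
    """Discard roughly drop_fraction of the features, lowest scores first."""
    n_drop = max(1, int(len(features) * drop_fraction))
    ranked = sorted(features, key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:len(features) - n_drop])


# Synthetic stand-in for OOB cases: only feature 0 carries class information.
rng = random.Random(0)
rows = [[i / 40, rng.random(), rng.random(), rng.random()] for i in range(40)]
labels = [1 if row[0] > 0.5 else 0 for row in rows]


def toy_predict(row):
    """Toy classifier (assumption): thresholds feature 0, ignores the rest."""
    return 1 if row[0] > 0.5 else 0


scores = permutation_importance(toy_predict, rows, labels, rng)
kept = drop_least_important([0, 1, 2, 3], scores, drop_fraction=0.5)
print(scores, kept)
```

Shuffling a column the predictor ignores leaves accuracy unchanged, so those features score zero and are the first to be dropped. In a real forest the scores would be averaged over the OOB cases of every tree, and the drop step would be repeated with a newly fitted forest each round.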


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Zena M. Hira ◽  
Duncan F. Gillies

We summarise various ways of performing dimensionality reduction on high-dimensional microarray data. Many different feature selection and feature extraction methods exist and they are widely used. All these methods aim to remove redundant and irrelevant features so that classification of new instances will be more accurate. A popular source of data is microarrays, a biological platform for gathering gene expressions. Analysing microarrays can be difficult due to the size of the data they provide. In addition, the complicated relations among the different genes make analysis more difficult, and removing excess features can improve the quality of the results. We present some of the most popular methods for selecting significant features and provide a comparison between them. Their advantages and disadvantages are outlined in order to provide a clearer idea of when to use each of them to save computational time and resources.

