Hybrid feature selection methods for the Classification of Cancer in Micro-array Gene expression data: a Survey

Several researchers have focused on random-forest-based inference methods because of their excellent performance. Some of these methods can also analyze both time-series and static gene expression data. However, they only rank the candidate regulations by assigning each a confidence value; none can detect the regulations that actually affect a gene of interest. In this study, we propose a method that removes unpromising candidate regulations by combining a random-forest-based inference method with a series of feature selection methods. In addition to detecting unpromising regulations, the proposed method uses the outputs of the feature selection methods to adjust the confidence values that the random-forest-based inference method has computed for all candidate regulations. In numerical experiments on artificial problems, combining the feature selection methods improved the performance of the random-forest-based inference method in 99 of 100 trials. However, the improvement tends to be small, since the combined method removed at most 19% of the candidate regulations. The combination also increases the computational cost. While a larger improvement at a lower computational cost would be ideal, we do not regard these limitations as an impediment to our investigation, given that our aim is to extract as much useful information as possible from a limited amount of gene expression data.
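The abstract does not say how the feature selection outputs adjust the confidence values, so the following is only a minimal sketch under a stated assumption: each confidence value is rescaled by the fraction of feature selection methods that kept the regulator, and a regulator kept by none is removed. The function name `adjust_confidences`, the gene names, and the voting rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def adjust_confidences(
    confidence: Dict[str, float], selections: List[Set[str]]
) -> Dict[str, float]:
    """Rescale each confidence by the share of selectors that kept the regulator.

    ASSUMPTION: this voting rule is illustrative only; the paper's actual
    adjustment rule is not given in the abstract.
    """
    if not selections:
        # No feature selection output available: leave the ranking unchanged.
        return dict(confidence)
    adjusted = {}
    for regulator, value in confidence.items():
        votes = sum(regulator in kept for kept in selections)
        if votes == 0:
            # Kept by no selector: treat as an unpromising regulation.
            continue
        adjusted[regulator] = value * votes / len(selections)
    return adjusted


# Hypothetical confidence values from a random-forest-based inference method.
rf_confidence = {"geneA": 0.50, "geneB": 0.30, "geneC": 0.20}
# Hypothetical outputs of four feature selection methods (sets of kept regulators).
selector_outputs = [{"geneA", "geneB"}, {"geneA"}, {"geneA", "geneB"}, {"geneA"}]

adjusted = adjust_confidences(rf_confidence, selector_outputs)
ranking = sorted(adjusted, key=adjusted.get, reverse=True)
print(ranking)
```

In this toy input, `geneC` is dropped because no selector kept it, while `geneB` is kept but down-weighted because only half of the selectors retained it.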

Incorporating Pathway Information into Feature Selection towards Better Performed Gene Signatures

BioMed Research International ◽ 10.1155/2019/2497509 ◽ 2019 ◽ Vol 2019 ◽ pp. 1-12 ◽ Cited By ~ 1

Author(s): Suyan Tian ◽ Chi Wang ◽ Bing Wang

Keyword(s): Gene Expression ◽ Feature Selection ◽ Gene Expression Data ◽ Gene Selection ◽ Selection Process ◽ Biological Knowledge ◽ Expression Data ◽ Selection Methods ◽ Its Gene ◽ Active Research

Feature selection is of critical importance for analyzing gene expression data with sophisticated grouping structures and for extracting hidden patterns from such data. It is well known that genes do not function in isolation but work together within various metabolic, regulatory, and signaling pathways. A method that takes the biological knowledge contained in these pathways into account is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart, in which no biological knowledge is considered. In this article, pathway-based feature selection is first divided into three major categories: pathway-level selection, bilevel selection, and pathway-guided gene selection. Regarding bilevel selection methods as a special case of pathway-guided gene selection, we discuss pathway-guided gene selection methods in detail, along with the importance of penalization in such methods. Last, we point out potential uses of pathway-guided gene selection in one active research avenue, namely the analysis of longitudinal gene expression data. We believe this article provides valuable insights for computational biologists and biostatisticians so that they can make biology more computable.
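To make the role of penalization concrete, here is a minimal sketch of one well-known pathway-aware penalty, the group lasso, which shrinks all coefficients in a pathway together and can set a whole pathway to zero. The abstract does not name a specific penalty, so the choice of group lasso, the function name `group_soft_threshold`, and the toy pathways are assumptions made for illustration. The sketch also assumes non-overlapping pathways.

```python
import math
from typing import Dict, List


def group_soft_threshold(
    beta: List[float], pathways: Dict[str, List[int]], lam: float
) -> List[float]:
    """Proximal step for the penalty lam * sum_g sqrt(|g|) * ||beta_g||_2.

    ASSUMPTION: pathways do not overlap. Genes that belong to no pathway
    are left unpenalized.
    """
    out = list(beta)
    for genes in pathways.values():
        norm = math.sqrt(sum(beta[j] ** 2 for j in genes))
        threshold = lam * math.sqrt(len(genes))
        # A pathway whose coefficient norm is below the threshold is zeroed
        # as a whole; otherwise every gene in it is shrunk by the same factor.
        scale = 0.0 if norm <= threshold else 1.0 - threshold / norm
        for j in genes:
            out[j] = beta[j] * scale
    return out


# Hypothetical coefficients for four genes grouped into two toy pathways.
beta = [3.0, 4.0, 0.1, -0.1]
pathways = {"pathway_1": [0, 1], "pathway_2": [2, 3]}

shrunk = group_soft_threshold(beta, pathways, lam=1.0)
print(shrunk)
```

With these toy values, the weak second pathway is removed entirely while both genes of the first pathway are shrunk by a common factor, which corresponds to selection at the pathway level. Selecting individual genes inside a retained pathway would need an additional gene-level penalty.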

Simultaneous feature selection and clustering of micro-array and RNA-sequence gene expression data using multiobjective optimization

International Journal of Machine Learning and Cybernetics ◽ 10.1007/s13042-020-01139-x ◽ 2020 ◽ Vol 11 (11) ◽ pp. 2541-2563

Author(s): Abhay Kumar Alok ◽ Pooja Gupta ◽ Sriparna Saha ◽ Vineet Sharma

Keyword(s): Gene Expression ◽ Feature Selection ◽ Multiobjective Optimization ◽ Gene Expression Data ◽ Expression Data ◽ Rna Sequence ◽ Micro Array

A Survey on Hybrid Feature Selection Methods in Microarray Gene Expression Data for Cancer Classification

IEEE Access ◽ 10.1109/access.2019.2922987 ◽ 2019 ◽ Vol 7 ◽ pp. 78533-78548 ◽ Cited By ~ 21

Author(s): Nada Almugren ◽ Hala Alshamlan

Keyword(s): Gene Expression ◽ Feature Selection ◽ Gene Expression Data ◽ Microarray Gene Expression Data ◽ Cancer Classification ◽ Expression Data ◽ Selection Methods ◽ Microarray Gene Expression ◽ Microarray Gene

CLASSIFYING TEMPORAL MICROARRAY DATA BY SELECTING INFORMATIVE GENES

Journal of Bioinformatics and Computational Biology ◽ 10.1142/s0219720013410060 ◽ 2013 ◽ Vol 11 (03) ◽ pp. 1341006

Author(s): QIANG LOU ◽ ZORAN OBRADOVIC

Keyword(s): Gene Expression ◽ Feature Selection ◽ Gene Expression Data ◽ Microarray Data ◽ Data Sets ◽ Temporal Data ◽ Expression Data ◽ Selection Methods ◽ Temporal Gene Expression ◽ Single Matrix

To predict an individual's health status more accurately, clinical applications often need to analyze high-dimensional gene expression data that varies with time. A major challenge in predicting from such temporal microarray data is that the number of biomarkers used as features is typically much larger than the number of labeled subjects. One way to address this challenge is to perform feature selection as a preprocessing step and then apply a classification method to the selected features. However, traditional feature selection methods cannot handle multivariate temporal data unless the temporal data is first flattened into a single matrix. In this study, a feature selection filter that can directly select informative features from temporal gene expression data is proposed. In our approach, we measure the distance between the multivariate temporal data of two subjects. Based on this distance, we define the objective function of temporal-margin-based feature selection to maximize each subject's temporal margin in its own relevant subspace. Experimental results on synthetic data and two real flu data sets provide evidence that our method outperforms the alternatives, which flatten the temporal data in advance.
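The idea of a margin computed directly on temporal data can be illustrated with a small sketch. The abstract does not define the paper's distance or margin, so the code below assumes a feature-weighted Euclidean distance over aligned time points and a Relief-style margin (distance to the nearest subject of another class minus distance to the nearest subject of the same class). The names `temporal_distance` and `temporal_margin` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence

Series = List[List[float]]  # one subject: time points x features


def temporal_distance(a: Series, b: Series, weights: Sequence[float]) -> float:
    """Feature-weighted Euclidean distance between two aligned series.

    ASSUMPTION: both subjects are observed at the same time points.
    """
    total = 0.0
    for f, w in enumerate(weights):
        total += w * sum((ra[f] - rb[f]) ** 2 for ra, rb in zip(a, b))
    return math.sqrt(total)


def temporal_margin(
    index: int, subjects: List[Series], labels: List[int], weights: Sequence[float]
) -> float:
    """Nearest-miss distance minus nearest-hit distance (Relief-style, assumed)."""
    hits, misses = [], []
    for j, other in enumerate(subjects):
        if j == index:
            continue
        d = temporal_distance(subjects[index], other, weights)
        (hits if labels[j] == labels[index] else misses).append(d)
    return min(misses) - min(hits)


# Toy data: feature 0 separates the classes, feature 1 is noise.
subjects = [
    [[0.0, 5.0], [0.1, 1.0]],
    [[0.0, 1.0], [0.1, 4.0]],
    [[1.0, 5.0], [1.1, 1.0]],
    [[1.0, 1.0], [1.1, 4.0]],
]
labels = [0, 0, 1, 1]

margin_all = temporal_margin(0, subjects, labels, weights=[1.0, 1.0])
margin_informative = temporal_margin(0, subjects, labels, weights=[1.0, 0.0])
print(margin_all, margin_informative)
```

In this toy example the margin of the first subject is negative when both features are weighted equally and positive when only the informative feature is kept, which is the kind of signal a margin-maximizing feature selector would exploit.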

Classification of Micro Array Gene Expression Data using Statistical Analysis Approach with Personalized Fuzzy Inference System

International Journal of Computer Applications ◽ 10.5120/3787-5215 ◽ 2011 ◽ Vol 31 (1) ◽ pp. 5-12

Author(s): Tamilselvi Madeswaran ◽ G.M.Kadhar Nawaz

Keyword(s): Gene Expression ◽ Statistical Analysis ◽ Gene Expression Data ◽ Fuzzy Inference System ◽ Fuzzy Inference ◽ Analysis Approach ◽ Expression Data ◽ Inference System ◽ Micro Array

A Comparative Performance Evaluation of Random Forest Feature Selection on Classification of Hepatocellular Carcinoma Gene Expression Data

2019 3rd International Conference on Informatics and Computational Sciences (ICICoS) ◽ 10.1109/icicos48119.2019.8982435 ◽ 2019 ◽ Cited By ~ 1

Author(s): Moh Abdul Latief ◽ Titin Siswantining ◽ Alhadi Bustamam ◽ Devvi Sarwinda

Keyword(s): Gene Expression ◽ Hepatocellular Carcinoma ◽ Feature Selection ◽ Performance Evaluation ◽ Random Forest ◽ Gene Expression Data ◽ Expression Data ◽ Comparative Performance
