Research on Disease Classification Model and Algorithms Based on Gene Expression Data

Advances in biotechnology have recently produced large-scale bioinformatics and genomic data, making the analysis of such data increasingly important. Numerous data mining methods have been developed to process genomic data. We extracted genes significant for prognosis prediction from the gene expression data of 1157 patients with kidney cancer. We then proposed an end-to-end, cost-sensitive hybrid deep learning (COST-HDL) approach that uses a cost-sensitive loss function for classification on imbalanced kidney cancer data. The approach combines two components: a deep symmetric autoencoder (the decoder mirrors the encoder's layer structure), trained with a reconstruction loss for non-linear feature extraction, and a neural network trained with a balanced classification loss for prognosis prediction, which addresses the data imbalance problem. Clinical data from patients with kidney cancer were combined with gene data to determine the optimal classification model and to estimate classification accuracy by sample type, primary diagnosis, tumor stage, and vital status, the risk factors representing the state of patients. Experimental results showed that the COST-HDL approach was more efficient on gene expression data for kidney cancer prognosis than conventional machine learning and data mining techniques. These results could be applied to extract features from gene biomarkers for the prognosis prediction, prevention, and early diagnosis of kidney cancer.
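The abstract does not state the exact form of the cost-sensitive loss. As a minimal sketch, assuming one common choice (cross-entropy weighted by inverse class frequency, the same heuristic as scikit-learn's `class_weight='balanced'`), the idea can be illustrated as follows. The function names and the toy data are hypothetical and are not taken from the COST-HDL paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting: n_samples / (n_classes * class_count) per class."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight."""
    eps = 1e-12  # guards against log(0)
    total = sum(
        weights[y] * -math.log(max(p[y], eps)) for p, y in zip(probs, labels)
    )
    return total / sum(weights[y] for y in labels)


# Toy imbalanced data: four majority-class samples, one minority-class sample.
labels = [0, 0, 0, 0, 1]
# Hypothetical predicted class probabilities; the minority sample is misclassified.
probs = [[0.9, 0.1]] * 4 + [[0.6, 0.4]]

weights = inverse_frequency_weights(labels)
loss = weighted_cross_entropy(probs, labels, weights)
unweighted = sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)
print(weights, loss, unweighted)
```

With these weights the single minority sample contributes as much to the loss as the four majority samples together, so a misclassified minority case is penalised more heavily than under the unweighted mean.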

Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

BMC Bioinformatics, Vol 8 (1), 2007. DOI: 10.1186/1471-2105-8-67. Cited By ~ 13

Author(s): Xin Zhao, Leo Wang-Kit Cheung

Keyword(s): Gene Expression, Gene Expression Data, Gaussian Processes, Microarray Gene Expression Data, Disease Classification, Expression Data, Microarray Gene Expression, Microarray Gene

The ant colony algorithm for feature selection in high-dimension gene expression data for disease classification

Mathematical Medicine and Biology: A Journal of the IMA, Vol 24 (4), pp. 413-426, 2007. DOI: 10.1093/imammb/dqn001. Cited By ~ 30

Author(s): K. R. Robbins, W. Zhang, J. K. Bertrand, R. Rekaya

Keyword(s): Gene Expression, Feature Selection, Gene Expression Data, High Dimension, Ant Colony Algorithm, Ant Colony, Disease Classification, Expression Data

A signal-to-noise classification model for identification of differentially expressed genes from gene expression data

2011 3rd International Conference on Electronics Computer Technology, 2011. DOI: 10.1109/icectech.2011.5941685. Cited By ~ 1

Author(s): Debahuti Mishra, Barnali Sahu

Keyword(s): Gene Expression, Differentially Expressed Genes, Gene Expression Data, Differentially Expressed, Classification Model, Expression Data, Signal To Noise, Noise Classification

Comparative Study of Disease Classification Using Multiple Machine Learning Models Based on Landmark and Non-Landmark Gene Expression Data

Procedia Computer Science, Vol 185, pp. 264-273, 2021. DOI: 10.1016/j.procs.2021.05.028

Author(s): Xiaoqin Huang, Jian Sun, Satish Mahadevan Srinivasan, Raghvinder S Sangwan

Keyword(s): Gene Expression, Machine Learning, Comparative Study, Gene Expression Data, Disease Classification, Expression Data, Learning Models, Machine Learning Models

Bayesian variable selection for disease classification using gene expression data

Bioinformatics, Vol 26 (2), pp. 215-222, 2009. DOI: 10.1093/bioinformatics/btp638. Cited By ~ 46

Author(s): Yang Ai-Jun, Song Xin-Yuan

Keyword(s): Gene Expression, Variable Selection, Gene Expression Data, Bayesian Variable Selection, Disease Classification, Expression Data, Selection For

Response to "Comments on 'Bayesian variable selection for disease classification using gene expression data'"

Bioinformatics, Vol 27 (15), pp. 2169-2170, 2011. DOI: 10.1093/bioinformatics/btr334. Cited By ~ 1

Author(s): X.-Y. Song, Z.-H. Lu

Keyword(s): Gene Expression, Variable Selection, Gene Expression Data, Bayesian Variable Selection, Disease Classification, Expression Data, Selection For

Comments on 'Bayesian variable selection for disease classification using gene expression data'

Bioinformatics, Vol 27 (8), p. 1194, 2011. DOI: 10.1093/bioinformatics/btr071. Cited By ~ 2

Author(s): M. C. Baragatti, D. Pommeret

Keyword(s): Gene Expression, Variable Selection, Gene Expression Data, Bayesian Variable Selection, Disease Classification, Expression Data, Selection For

Classification of Gene Expression Data using Efficient Feature Selection Technique and Resampling Method

International Journal of Engineering and Advanced Technology - Regular Issue, Vol 8 (6), pp. 406-414, 2019. DOI: 10.35940/ijeat.e7816.088619

Keyword(s): Gene Expression, Feature Selection, Gene Expression Data, Classification Model, Support Vector, Expression Data, Feature Selection Technique, Selection Technique, Resampling Method

Microarray technology has become a powerful tool that many researchers use to analyze gene expression levels in a given organism. Gene expression data typically have a very large number of features (thousands) and comparatively few samples (hundreds), which makes them difficult to analyze. An efficient feature selection technique must therefore be applied before any further analysis, and feature selection plays a vital role in the classification of gene expression data. Several feature selection techniques have been introduced in this field, and Support Vector Machine with Recursive Feature Elimination (SVM-RFE) has proven to be among the most promising. SVM-RFE ranks the genes (features) by training an SVM classification model and, in combination with the RFE procedure, selects the key genes. Its main drawback is high time consumption. We introduced an efficient implementation of linear SVM to overcome this problem and improved RFE with a variable step size. The combined method was then used to select informative genes. An effective resampling method is proposed to preprocess the datasets; it balances the distribution of samples, which gives more reliable classification results. In this paper, we have also studied the applicability of common classifiers. Detailed experiments are conducted on four commonly used microarray gene expression datasets. The results show that the proposed method achieves comparable classification performance.
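The abstract does not specify how the step size varies. As a minimal sketch, assuming one plausible schedule (drop a fixed fraction of the remaining features while many remain, then one feature per round), recursive feature elimination with a variable step can be written as below. The names `rfe_variable_step`, `score_fn`, `fraction`, and `switch_at` are hypothetical; in SVM-RFE the scorer would retrain a linear SVM on the active features and return its squared weights, which is replaced here by a fixed toy importance table.

```python
def rfe_variable_step(score_fn, n_features, n_keep, fraction=0.5, switch_at=8):
    """Recursive feature elimination with a shrinking step size (assumed schedule).

    score_fn(active) must return one importance score per feature index in
    `active`; the lowest-scoring features are removed each round.
    """
    active = list(range(n_features))
    while len(active) > n_keep:
        scores = score_fn(active)
        # Large steps while many features remain, single steps near the end.
        if len(active) > switch_at:
            step = max(1, int(len(active) * fraction))
        else:
            step = 1
        step = min(step, len(active) - n_keep)  # never drop below n_keep
        order = sorted(range(len(active)), key=lambda i: scores[i])
        drop = set(order[:step])
        active = [f for i, f in enumerate(active) if i not in drop]
    return active


# Toy stand-in for the SVM ranking: a fixed importance per feature.
importance = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2, 0.6, 0.4, 0.5, 0.15, 0.95]
calls = []


def toy_scores(active):
    calls.append(len(active))  # record how many features each round scored
    return [importance[f] for f in active]


selected = rfe_variable_step(toy_scores, n_features=12, n_keep=3)
print(selected, calls)
```

In this toy run the scorer is called four times (on 12, 6, 5, and 4 features) instead of the nine calls that one-at-a-time elimination would need, which is where the time saving of a variable step comes from.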

An Effective Classification Model for Cancer Diagnosis Using Micro Array Gene Expression Data

Classification of Kidney Cancer Data Using Cost-Sensitive Hybrid Deep Learning Approach