Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data

Statistical analysis of microarray gene expression data has recently attracted a great deal of attention. One problem of interest is to relate genes to patients' survival outcomes in order to build regression models that predict future patients' survival from their gene expression data. For this purpose, several authors have discussed using the proportional hazards (Cox) model after reducing the dimension of the gene expression data. This paper presents a new approach to Cox survival analysis of microarray gene expression data that focuses on the models' predictive ability. The method modifies correlation principal component regression (Sun, 1995) to handle the censoring problem of survival data. Results on simulated data and on a publicly available diffuse large B-cell lymphoma data set show that the proposed method compares well with some existing partial least squares approaches in terms of robustness and predictive ability. The new approach is also simpler and easier to implement.

Cancer classification based on microarray gene expression data using a principal component accumulation method

Science China Chemistry, 2011, Vol 54 (5), pp. 802-811. DOI: 10.1007/s11426-011-4263-5. Cited by ~20.

Author(s): JingJing Liu, WenSheng Cai, XueGuang Shao

Keyword(s): Gene Expression, Gene Expression Data, Principal Component, Microarray Gene Expression Data, Cancer Classification, Expression Data, Microarray Gene Expression, Microarray Gene, Accumulation Method

Cancer classification by sparse representation using microarray gene expression data

2008 IEEE International Conference on Bioinformatics and Biomedicine Workshops, 2008. DOI: 10.1109/bibmw.2008.4686232. Cited by ~6.

Author(s): Xiyi Hang

Keyword(s): Gene Expression, Sparse Representation, Gene Expression Data, Microarray Gene Expression Data, Cancer Classification, Expression Data, Microarray Gene Expression, Microarray Gene

A Survey on Hybrid Feature Selection Methods in Microarray Gene Expression Data for Cancer Classification

IEEE Access, 2019, Vol 7, pp. 78533-78548. DOI: 10.1109/access.2019.2922987. Cited by ~21.

Author(s): Nada Almugren, Hala Alshamlan

Keyword(s): Gene Expression, Feature Selection, Gene Expression Data, Microarray Gene Expression Data, Cancer Classification, Expression Data, Selection Methods, Microarray Gene Expression, Microarray Gene

Hybrid Feature Selection Algorithm mRMR-ICA for Cancer Classification from Microarray Gene Expression Data

Combinatorial Chemistry & High Throughput Screening, 2018, Vol 21 (6), pp. 420-430. DOI: 10.2174/1386207321666180601074349. Cited by ~4.

Author(s): Shuaiqun Wang, Wei Kong, Aorigele, Jin Deng, Shangce Gao, ...

Keyword(s): Gene Expression, Feature Selection, Gene Expression Data, Classification Accuracy, Microarray Gene Expression Data, Cancer Classification, Expression Data, Microarray Gene Expression, Redundant Genes, Microarray Gene

Aims and Objective: Redundant information in microarray gene expression data makes cancer classification difficult, so it is important to find appropriate ways of selecting informative genes for better identification of cancer. This study presents a hybrid feature selection method, mRMR-ICA, which combines minimum redundancy maximum relevance (mRMR) with the imperialist competition algorithm (ICA) for cancer classification. Materials and Methods: mRMR-ICA uses mRMR as a preprocessing step to delete redundant genes and to provide a smaller dataset to ICA for feature selection. A support vector machine (SVM) evaluates the classification accuracy of the selected feature genes. The fitness function includes both the classification accuracy and the number of selected genes. Results: Ten benchmark microarray gene expression datasets are used to test the performance of mRMR-ICA. Compared with the original ICA and other evolutionary algorithms, mRMR-ICA improves both the accuracy of cancer classification and the number of informative genes selected. Conclusion: The comparison results demonstrate that mRMR-ICA effectively deletes redundant genes, so that the algorithm selects fewer informative genes while obtaining better classification results. It can also shorten calculation time and improve efficiency.
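The two ingredients named in this abstract, an mRMR-style filter and a fitness function that trades accuracy against the number of selected genes, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: mRMR is normally defined with mutual information, whereas the sketch substitutes absolute Pearson correlation to stay dependency-free, and the weighted-sum form of `fitness` with its `alpha` parameter is hypothetical, since the abstract does not give the formula. The names `pearson`, `mrmr_select`, and `fitness` are likewise invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mrmr_select(genes, labels, k):
    """Greedy mRMR-style filter.

    genes  : dict mapping gene name -> list of expression values (one per sample)
    labels : list of numeric class labels (one per sample)
    k      : number of genes to keep

    Assumption: |Pearson correlation| stands in for mutual information.
    """
    relevance = {g: abs(pearson(v, labels)) for g, v in genes.items()}
    selected = []
    candidates = set(genes)

    def score(g):
        # Relevance to the label minus mean redundancy with genes already chosen.
        if not selected:
            return relevance[g]
        redundancy = sum(abs(pearson(genes[g], genes[s])) for s in selected)
        return relevance[g] - redundancy / len(selected)

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness: reward accuracy, penalise large gene subsets.

    alpha weights accuracy against subset size; higher return value is better.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    return alpha * accuracy + (1.0 - alpha) * (1.0 - n_selected / n_total)
```

In a full pipeline, the `accuracy` argument would come from cross-validating an SVM on the candidate gene subset, and a search procedure such as ICA would maximise `fitness` over subsets of the genes that survive the mRMR filter.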

Cancer Classification by a Hybrid Method Using Microarray Gene Expression Data

Journal of Computational and Theoretical Nanoscience, 2015, Vol 12 (10), pp. 3194-3200. DOI: 10.1166/jctn.2015.4101. Cited by ~1.

Author(s): Bin Yu, Yan Zhang, Likuan Zhao

Keyword(s): Gene Expression, Gene Expression Data, Hybrid Method, Microarray Gene Expression Data, Cancer Classification, Expression Data, Microarray Gene Expression, Microarray Gene

ANFIS-based wrapper model gene selection for cancer classification on microarray gene expression data

2013 13th Iranian Conference on Fuzzy Systems (IFSC), 2013. DOI: 10.1109/ifsc.2013.6675687. Cited by ~1.

Author(s): Sina Mahmoudi, Biyuk Sadeghi Lahijan, Hamidreza Rashidy Kanan

Keyword(s): Gene Expression, Gene Expression Data, Gene Selection, Microarray Gene Expression Data, Cancer Classification, Expression Data, Microarray Gene Expression, Microarray Gene, Selection For

Additive Risk Analysis of Microarray Gene Expression Data via Correlation Principal Component Regression

Journal of Bioinformatics and Computational Biology, 2010, Vol 08 (04), pp. 645-659. DOI: 10.1142/s0219720010004914. Cited by ~5.

Author(s): Yichuan Zhao, Guoshen Wang

Keyword(s): Gene Expression, Gene Expression Data, Risk Model, Principal Component Regression, Principal Component, Microarray Gene Expression Data, Expression Data, Microarray Gene Expression, Data Set, Additive Risk

To predict future patients' survival time from their microarray gene expression data, one question of interest is how to relate genes to survival outcomes. In this paper, we apply a semi-parametric additive risk model from survival analysis and propose a new approach to analyzing gene expression data that focuses on the model's predictive ability. The proposed method applies correlation principal component regression to right-censored survival data within the semi-parametric additive risk model framework with high-dimensional covariates. We also use the time-dependent area under the receiver operating characteristic curve and the root mean squared error of prediction to assess how well the model predicts survival time. Furthermore, the proposed method is able to identify genes that are significantly related to the disease. Finally, the approach is illustrated on a diffuse large B-cell lymphoma data set and a breast cancer data set. The results show that the model fits both data sets very well.
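For orientation, the semi-parametric additive risk model and the root mean squared error mentioned above are commonly written as follows. This is a sketch in standard textbook notation, not the paper's own: the symbols $Z_i$ (the covariate vector for patient $i$, here assumed to be the selected principal component scores), $\beta$, $\lambda_0$, $y_i$, and $\hat{y}_i$ are assumptions, and the abstract does not say how censored observations enter the error calculation.

```latex
% Additive risk model: covariates shift the baseline hazard additively
\lambda(t \mid Z_i) = \lambda_0(t) + \beta^{\top} Z_i

% Root mean squared error of prediction over n evaluated patients
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^{2}}
```

In contrast to the Cox model, where covariates multiply the baseline hazard through $\exp(\beta^{\top} Z_i)$, the additive form lets each component of $Z_i$ add to or subtract from the hazard directly.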
