Microarray gene expression data with linked survival phenotypes: diffuse large-B-cell lymphoma revisited

Biostatistics ◽  
2005 ◽  
Vol 7 (2) ◽  
pp. 268-285 ◽  
Author(s):  
Mark R. Segal
2021 ◽  
Author(s):  
Mohamad Zamani-Ahmadmahmudi ◽  
Seyed Mahdi Nassiri ◽  
Amir Asadabadi

Abstract: Gene expression profiling has been widely used to identify genes that predict clinical outcome in patients with diverse cancers, including diffuse large B-cell lymphoma (DLBCL). With the aid of bioinformatics and computational analysis of gene expression data, various prognostic gene signatures for DLBCL have recently been developed. The major drawback of these signatures is that they fail to predict survival correctly in external datasets; in other words, they are not reproducible. Hence, in this study, we sought to determine the gene(s) that can reproducibly and robustly predict survival in patients with DLBCL. Gene expression data were extracted from 7 datasets containing 1636 patients (GSE10846 [n=420], GSE31312 [n=470], GSE11318 [n=203], GSE32918 [n=172], GSE4475 [n=123], GSE69051 [n=157], and GSE34171 [n=91]). Genes significantly associated with overall survival were detected using univariate Cox proportional hazards analysis with a P value <0.001 and a false discovery rate (FDR) <5%. Thereafter, the significant genes common to all the datasets were extracted. Additionally, chromosomal aberrations in the genomic region of the final common gene(s) were evaluated as copy number alterations using the single nucleotide polymorphism (SNP) data of 570 patients with DLBCL (GSE58718 [n=242], GSE57277 [n=148], and GSE34171 [n=180]). Our results indicated that reticulon family gene 1 (RTN1) was the only gene that met our rigorous pipeline criteria and was associated with a favorable clinical outcome in all the datasets (P<0.001, FDR<5%). In the multivariate Cox proportional hazards analysis, this gene remained independent of the routine International Prognostic Index components (i.e., age, stage, lactate dehydrogenase level, Eastern Cooperative Oncology Group [ECOG] performance status, and number of extranodal sites) (P<0.0001). Furthermore, no significant chromosomal aberration was found in the RTN1 genomic region (14q23.1: Start 59,595,976/ End 59,870,966).
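The selection step described in this abstract (a per-dataset significance filter followed by an intersection across datasets) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the univariate Cox P values have already been computed for each gene in each dataset, it assumes the Benjamini-Hochberg procedure for the FDR (the abstract does not name the FDR method), and the function names, cohort labels, gene labels, and P values are all hypothetical.

```python
from typing import Dict, List, Set


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(
    pvals: Dict[str, float], p_cut: float = 0.001, fdr_cut: float = 0.05
) -> Set[str]:
    """Genes passing both the raw P-value cut and the FDR cut in one dataset."""
    genes = list(pvals)
    qvals = bh_adjust([pvals[g] for g in genes])
    return {g for g, q in zip(genes, qvals) if pvals[g] < p_cut and q < fdr_cut}


def common_genes(
    datasets: Dict[str, Dict[str, float]],
    p_cut: float = 0.001,
    fdr_cut: float = 0.05,
) -> Set[str]:
    """Genes that are significant in every dataset."""
    per_dataset = [significant_genes(p, p_cut, fdr_cut) for p in datasets.values()]
    return set.intersection(*per_dataset) if per_dataset else set()


# Hypothetical univariate Cox P values (gene -> P) for two toy cohorts.
toy = {
    "cohort_A": {"GENE1": 0.0001, "GENE2": 0.0004, "GENE3": 0.2},
    "cohort_B": {"GENE1": 0.0002, "GENE2": 0.03, "GENE3": 0.5},
}
print(common_genes(toy))
```

In the toy data, GENE2 passes both thresholds in cohort_A but not in cohort_B, so only GENE1 survives the intersection. Requiring significance in every dataset is a strict criterion, which is consistent with the abstract reporting a single surviving gene.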


Author(s):  
Qiang Zhao ◽  
Jianguo Sun

Statistical analysis of microarray gene expression data has recently attracted a great deal of attention. One problem of interest is to relate genes to patients' survival outcomes, with the purpose of building regression models that predict future patients' survival from their gene expression data. For this, several authors have discussed the use of the proportional hazards (Cox) model after reducing the dimension of the gene expression data. This paper presents a new approach to Cox survival analysis of microarray gene expression data, with a focus on the models' predictive ability. The method modifies correlation principal component regression (Sun, 1995) to handle the censoring problem of survival data. Results based on simulated data and a publicly available dataset on diffuse large B-cell lymphoma show that the proposed method works well in terms of the models' robustness and predictive ability in comparison with some existing partial least squares approaches. The new approach is also simpler and easier to implement.

