Development and Clinical Validation of a Seven-Gene Prognostic Signature Based on Multiple Machine Learning Algorithms in Kidney Cancer

2021 ◽  
Vol 30 ◽  
pp. 096368972096917
Author(s):  
Mi Tian ◽  
Tao Wang ◽  
Peng Wang

About a third of patients with kidney cancer experience recurrence or cancer-related progression. Clinically, kidney cancer prognoses may be quite different, even in patients with kidney cancer at the same clinical stage. Therefore, there is an urgent need to screen for kidney cancer prognosis biomarkers. Differentially expressed genes (DEGs) were identified using kidney cancer RNA sequencing data from the Gene Expression Omnibus (GEO) database. Biomarkers were screened using random forest (RF) and support vector machine (SVM) models, and a multigene signature was constructed using least absolute shrinkage and selection operator (LASSO) regression analysis. Univariate and multivariate Cox regression analyses were performed to explore the relationships between clinical features and prognosis. Finally, the reliability and clinical applicability of the model were validated, and relationships with biological pathways were identified. Western blots were also performed to evaluate gene expression. A total of 50 DEGs were obtained by intersecting the RF and SVM models. A seven-gene signature (RNASET2, EZH2, FXYD5, KIF18A, NAT8, CDCA7, and WNT7B) was constructed by LASSO regression. Univariate and multivariate Cox regression analyses showed that the seven-gene signature was an independent prognostic factor for kidney cancer. A predictive nomogram was then established in The Cancer Genome Atlas (TCGA) cohort and validated internally. In tumor tissue, RNASET2 and FXYD5 were highly expressed and NAT8 was expressed at low levels, at both the protein and transcription levels. This model could complement the clinicopathological characteristics of kidney cancer and promote the personalized management of patients with kidney cancer.
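The step of intersecting the RF and SVM selections can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper `top_k`, the cutoff of three features, and all numeric scores are assumptions made up for the example; only the gene names come from the abstract.

```python
def top_k(scores, k):
    """Return the names of the k features with the largest scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Hypothetical per-gene scores from two models (numbers are invented).
rf_importance = {"EZH2": 0.31, "NAT8": 0.27, "KIF18A": 0.18, "WNT7B": 0.05, "CDCA7": 0.02}
svm_weight = {"EZH2": 0.90, "NAT8": 0.40, "KIF18A": 0.10, "WNT7B": 0.70, "CDCA7": 0.20}

# Keep only the genes that both models rank among their top three.
shared = top_k(rf_importance, 3) & top_k(svm_weight, 3)
print(sorted(shared))
```

Taking the intersection is a conservative choice: a gene survives only if both models rank it highly, which reduces the candidate list before a penalized regression step.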

2021 ◽  
Vol 12 ◽  
Author(s):  
Yunfei Dong ◽  
Tao Shang ◽  
HaiXin Ji ◽  
Xiukou Zhou ◽  
Zhi Chen

Background: The pathological stage of colon cancer cannot accurately predict recurrence, and to date, no gene expression characteristics have been demonstrated to be reliable for prognostic stratification in clinical practice, perhaps because colon cancer is a heterogeneous disease. The purpose of this study was to establish a comprehensive molecular classification and prognostic marker for colon cancer based on invasion-related expression profiling. Methods: From the Gene Expression Omnibus (GEO) database, we collected two microarray datasets of colon cancer samples, and another dataset was obtained from The Cancer Genome Atlas (TCGA). Differentially expressed genes (DEGs) further underwent univariate analysis, least absolute shrinkage and selection operator (LASSO) regression analysis, and multivariate Cox survival analysis to screen prognosis-associated feature genes, which were further verified with test datasets. Results: Two molecular subtypes (C1 and C2) were identified based on invasion-related genes in the colon cancer samples in the TCGA training dataset, and C2 had a good prognosis. Moreover, C1 was more sensitive to immunotherapy. A total of 1,514 invasion-related genes, specifically 124 downregulated genes and 1,390 upregulated genes in C1 and C2, were identified as DEGs. A four-gene prognostic signature was identified and validated, and colon cancer patients were stratified into a high-risk group and a low-risk group. Multivariate regression analyses and a nomogram indicated that the four-gene signature developed in this study was an independent predictive factor and had a relatively good predictive capability when adjusting for other clinical factors. Conclusion: This research provided novel insights into the mechanisms underlying invasion and offered a novel biomarker of a poor prognosis in colon cancer patients.
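For orientation, LASSO regression applied to survival data is commonly written as an L1-penalized Cox partial likelihood. The form below is the standard textbook objective under the assumption of no tied event times; the abstract does not state which exact formulation or software was used, and the symbols are introduced here only for illustration.

```latex
\hat{\beta} = \arg\max_{\beta}
\left[
  \sum_{i:\,\delta_i = 1}
  \left( x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right)
  - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
\right]
```

Here $x_i$ is the expression vector of patient $i$, $\delta_i$ is the event indicator, $R(t_i)$ is the set of patients still at risk at time $t_i$, and $\lambda \ge 0$ controls how many coefficients $\beta_k$ are shrunk to exactly zero, which is what reduces a long list of DEGs to a short signature.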


2020 ◽  
Vol 11 ◽  
Author(s):  
Fei Ye ◽  
Jie Liang ◽  
Jiaoxing Li ◽  
Haiyan Li ◽  
Wenli Sheng

Background: Multiple sclerosis (MS) is an inflammatory and demyelinating disease of the central nervous system with a variable natural history of relapse and remission. Previous studies have found many differentially expressed genes (DEGs) in the peripheral blood of MS patients and healthy controls, but the value of these genes for predicting the risk of relapse remains elusive. Here we develop and validate an effective and noninvasive gene signature for predicting relapse-free survival (RFS) in MS patients. Methods: Gene expression matrices were downloaded from Gene Expression Omnibus and ArrayExpress. DEGs in MS patients and healthy controls were screened in an integrated analysis of seven data sets. Candidate genes from a combination of protein–protein interaction and weighted correlation network analysis were used to identify key genes related to RFS. An independent data set (GSE15245) was randomized into training and test groups. Univariate and least absolute shrinkage and selection operator–Cox regression analyses were used in the training group to develop a gene signature. A nomogram incorporating independent risk factors was developed via multivariate Cox regression analyses. Kaplan–Meier methods, receiver-operating characteristic (ROC) curves, and Harrell's concordance index (C-index) were used to estimate the performance of the gene signature and nomogram. The test group was used for external validation. Results: A five-gene signature comprising FTH1, GBP2, MYL6, NCOA4, and SRP9 was used to calculate risk scores to predict individual RFS. The risk score was an independent risk factor, and a nomogram incorporating clinical parameters was established. ROC curves and C-indices demonstrated great performance of these predictive tools in both the training and test groups. Conclusions: The five-gene signature may be a reliable tool for assisting physicians in predicting RFS in clinical practice.
We anticipate that these findings could not only facilitate personalized treatment for MS patients but also provide insight into the complex molecular mechanism of this disease.
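Harrell's concordance index, mentioned in the abstract above, can be illustrated with a small pure-Python sketch. The function name, the toy data, and the simplified handling of tied event times (such pairs are skipped) are assumptions for illustration; this is not the implementation used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i has the shorter follow-up time
    and an observed event (events[i] == 1). Ties in risk score count as 0.5.
    Pairs with identical times are ignored in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up time, event indicator (1 = relapse observed), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
c_index = harrell_c_index(times, events, risk)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.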


2020 ◽  
Author(s):  
Guanbao Zhou ◽  
Genjie Lu ◽  
Liang Yang ◽  
Yangfang Lu

Abstract Background: Hepatocellular carcinoma (HCC) is the most common type of liver cancer and has a relatively poor prognosis. Thus, we aimed to identify novel molecular biomarkers to effectively predict the prognosis of HCC patients and eventually guide treatment. Methods: Prognosis-associated genes were determined by Kaplan-Meier and multivariate Cox regression analyses using the expression and clinical data of 373 HCC patients from The Cancer Genome Atlas (TCGA) database and validated in an independent Gene Expression Omnibus (GEO) dataset. The classification of HCC was performed by unsupervised hierarchical clustering of ten gene expression levels. A prognostic risk score was established based on a linear combination of ten gene expression levels using the regression coefficients derived from the multivariate Cox regression models. Results: A total of 183 genes were significantly associated with prognosis in HCC. SLC25A15, RAB8A, GOT2, SORBS2, and IL18RAP were the top five protective genes, while FHL3, AMD1, DCAF13, UBE2E1, and PTDSS2 were the top five risk genes in HCC. SLC25A15, GOT2, and IL18RAP were significantly down-regulated and DCAF13, PTDSS2, and SORBS2 were significantly up-regulated in the HCC samples, and these genes exhibited high accuracy in differentiating HCC tissues from normal liver tissues. Hierarchical clustering analysis of the ten genes discovered three clusters of HCC patients. HCC tumors in clusters 1 and 2 were associated with significantly more favourable overall survival (OS) than those in cluster 3, and cluster 2 tumors showed a higher pathologic stage than cluster 3 tumors. The risk score was predictive of increased mortality rate in HCC patients. Conclusions: The ten-gene signature and the risk score may serve as novel molecular biomarkers for stratifying HCC patients and improving prognostic prediction.
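The risk score described above, a linear combination of expression levels weighted by Cox regression coefficients, can be sketched as below. The coefficient values, patient identifiers, and expression values are invented for illustration (only the two gene names come from the abstract), and the function name is an assumption.

```python
def risk_score(expression, coefficients):
    """Sum of coefficient * expression over the genes in the signature."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


# Hypothetical Cox coefficients: negative for a protective gene,
# positive for a risk gene. These numbers are not from the study.
coefficients = {"GOT2": -0.5, "FHL3": 0.8}

# Hypothetical normalized expression values for two patients.
patients = {
    "P1": {"GOT2": 2.0, "FHL3": 1.0},
    "P2": {"GOT2": 1.0, "FHL3": 3.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
for pid, score in scores.items():
    print(pid, round(score, 2))
```

Under these made-up numbers, the patient with high expression of the risk gene and low expression of the protective gene receives the larger score, which is the behavior a risk score of this form is meant to capture.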


2021 ◽  
Vol 8 ◽  
Author(s):  
Yuren Xia ◽  
Xin Li ◽  
Xiangdong Tian ◽  
Qiang Zhao

Background: Neuroblastoma (NB), the most common solid tumor in children, exhibits vastly different genomic abnormalities and clinical behaviors. While significant progress has been made on the research of relations between clinical manifestations and genetic abnormalities, it remains a major challenge to predict the prognosis of patients to facilitate personalized treatments. Materials and Methods: Six data sets of gene expression and related clinical data were downloaded from the Gene Expression Omnibus (GEO) database, ArrayExpress database, and Therapeutically Applicable Research to Generate Effective Treatments (TARGET) database. According to the presence or absence of MYCN amplification, patients were divided into two groups. Differentially expressed genes (DEGs) were identified between the two groups. Enrichment analyses of these DEGs were performed to dig further into the molecular mechanism of NB. Stepwise Cox regression analyses were used to establish a five-gene prognostic signature whose predictive performance was further evaluated by external validation. Multivariate Cox regression analyses were used to explore independent prognostic factors for NB. The relevance of immunity was evaluated by using algorithms, and a nomogram was constructed. Results: A five-gene signature comprising CPLX3, GDPD5, SPAG6, NXPH1, and AHI1 was established. The five-gene signature had good performance in predicting survival and was demonstrated to be superior to International Neuroblastoma Staging System (INSS) staging and the MYCN amplification status. Finally, a nomogram based on the five-gene signature was established, and its clinical efficacy was demonstrated. Conclusion: Collectively, our study developed a novel five-gene signature and successfully built a prognostic nomogram that accurately predicted survival in NB. The findings presented here could help to stratify patients into subgroups and determine the optimal individualized therapy.


2019 ◽  
Vol 26 (1) ◽  
pp. 107327481985511 ◽  
Author(s):  
Qi Wang ◽  
Zongze He ◽  
Yong Chen

Low-grade gliomas (LGGs) are a highly heterogeneous group of slow-growing, lethal, diffusive brain tumors. Temozolomide (TMZ) is a frequently used primary chemotherapeutic agent for LGGs. Currently there is no consensus as to the optimal biomarkers to predict the efficacy of TMZ, which calls for decision-making for each patient while considering molecular profiles. Low-grade glioma data sets were retrieved from The Cancer Genome Atlas. Cox regression and survival analyses were applied to identify clinical features significantly associated with survival. Subsequently, ordinal logistic regression, co-expression, and Cox regression analyses were applied to identify genes that correlate significantly with response rate, disease-free survival, and overall survival of patients receiving TMZ as primary therapy. Finally, gene expression and methylation analyses were used to explain the mechanism linking the expression of these genes to TMZ efficacy in LGG patients. Overall survival was significantly correlated with age, Karnofsky Performance Status score, and histological grade, but not with IDH1 mutation status. Using 3 distinct efficacy end points, regression and co-expression analyses further identified a novel 4-gene signature of ASPM, CCNB1, EXO1, and KIF23, which negatively correlated with response to TMZ therapy. In addition, expression of the 4-gene signature was associated with that of genes involved in homologous recombination. Finally, expression and methylation profiling identified a largely unknown olfactory receptor, OR51F2, as a potential mediator of the role of the 4-gene signature in reducing TMZ efficacy. Taken together, these findings propose the 4-gene signature as a novel panel of efficacy predictors of TMZ therapy, as well as potential downstream mechanisms, including homologous recombination, OR51F2, and DNA methylation independent of MGMT.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jixin Wang ◽  
Xiangjun Yin ◽  
Yin-Qiang Zhang ◽  
Xuming Ji

Lung adenocarcinoma (LUAD) is a major subtype of lung cancer, and the prognosis of LUAD patients is associated with both lncRNAs and cancer immunity. In this study, we collected gene expression data of 585 LUAD patients from The Cancer Genome Atlas (TCGA) database and 605 subjects from the Gene Expression Omnibus (GEO) database. LUAD patients were divided into high and low immune-cell-infiltrated groups according to the single sample gene set enrichment analysis (ssGSEA) algorithm to identify differentially expressed genes (DEGs). Based on the 49 immune-related DE lncRNAs, a four-lncRNA prognostic signature was constructed by applying least absolute shrinkage and selection operator (LASSO) regression, univariate Cox regression, and stepwise multivariate Cox regression in sequence. Kaplan–Meier curves, ROC analysis, and the testing GEO datasets verified the effectiveness of the signature in predicting overall survival (OS). Univariate and multivariate Cox regression suggested that the signature was an independent prognostic factor. The correlation analysis revealed that the infiltrating immune cell subtypes were related to these lncRNAs.


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Derui Yan ◽  
Mingjing Shen ◽  
Zixuan Du ◽  
Jianping Cao ◽  
Ye Tian ◽  
...  

Adjuvant radiotherapy is one of the main treatment methods for breast cancer, but its clinical benefit depends largely on the characteristics of the patient. This study aimed to explore the relationship between the expression of zinc finger (ZNF) gene family proteins and the radiosensitivity of breast cancer patients. Clinical and gene expression data on a total of 976 breast cancer samples were obtained from The Cancer Genome Atlas (TCGA) database. ZNF gene expression was dichotomized into groups with a higher or lower level than the median level of expression. Univariate and multivariate Cox regression analyses were used to evaluate the relationship between ZNF gene expression levels and radiosensitivity. The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database was used for validation. The results revealed that 4 ZNF genes were possible radiosensitivity markers. High expression of ZNF644 and low expression levels of the other 3 genes (ZNF341, ZNF541, and ZNF653) were related to the radiosensitivity of breast cancer. Hierarchical clustering, Cox, and CoxBoost analyses based on these 4 ZNF genes indicated that patients with a favorable 4-gene signature had better overall survival on radiotherapy. Thus, this 4-gene signature may have value for selecting those patients most likely to benefit from radiotherapy. ZNF gene clusters could act as radiosensitivity signatures for breast cancer patients and may be involved in determining the radiosensitivity of cancer.
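The median dichotomization of expression described above can be sketched in a few lines. The function name, the sample values, and the decision to place values equal to the median in the "low" group are assumptions; the abstract does not say how ties were handled.

```python
from statistics import median


def dichotomize(values):
    """Label each value 'high' if it exceeds the median, otherwise 'low'."""
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]


# Hypothetical expression values of one gene across five samples.
expression = [1.2, 3.4, 2.2, 5.0, 0.7]
groups = dichotomize(expression)
print(groups)
```

The resulting two-level factor can then be used as a covariate in univariate or multivariate Cox regression.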


2020 ◽  
Author(s):  
Zengwei Tang ◽  
Xing Hunag ◽  
Enliang Li ◽  
Yinan Shen ◽  
Qi Zhang ◽  
...  

Abstract Background: Intrahepatic cholangiocarcinoma (iCCA) patients have poor outcomes due to the lack of biomarkers for the selection of treatment options. The present study was conducted to find biomarkers with independent prognostic value in iCCA patients. Methods: Gene transcriptome profiles of E-MTAB-6389, TCGA-CHOL and GSE26566 were obtained from ArrayExpress, The Cancer Genome Atlas and the Gene Expression Omnibus databases, respectively. Bioinformatic analyses were performed to screen novel biomarkers for predicting the prognosis of iCCA patients. Using multivariate Cox regression analyses, a 3-gene signature (BTD-FER-COL12A1) with potential prognostic value was identified and validated in both a training cohort and two validation cohorts. Results: A total of 177 iCCA patients were included in this study. From the key gene modules significantly associated with liver cirrhosis and overall survival (OS) of iCCA patients, we identified 89 hub genes for functional analyses. Cox regression analyses in both the training and validation cohorts indicated that FER, COL12A1 and BTD were independent risk factors for iCCA patients. A 3-gene signature (BTD-FER-COL12A1) with independent prognostic value in iCCA patients was validated in the training cohort, as well as in two validation cohorts. In terms of predicting the prognosis of iCCA patients, the receiver operating characteristic (ROC) curves showed that this 3-gene signature had superior prediction power to BTD, FER, and COL12A1 alone, as well as known biomarkers (MUC1, MUC13) of iCCA. Immunohistochemical staining of samples from The Human Protein Atlas showed that FER and COL12A1 were positively expressed in iCCA tissue, although BTD was not, while none of these genes was detected in normal tissue. These findings were consistent with the expression status of BTD, FER and COL12A1 at the transcriptional level.
In addition, we found that FER and COL12A1 were significantly associated with the degree of infiltration by tumor-infiltrating immune cells. Conclusion: We discovered a three-gene signature with independent prognostic value as a novel biomarker for predicting the prognosis of iCCA patients. Our findings may help to find novel therapeutic targets for precision treatment of iCCA.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Feng Jiang ◽  
Chuyan Wu ◽  
Ming Wang ◽  
Ke Wei ◽  
Jimei Wang

Breast cancer (BC) is one of the most frequently diagnosed tumors and a contributing cause of death in women. Many biomarkers associated with survival and prognosis were identified in previous studies through database mining. Nevertheless, the predictive capabilities of single-gene biomarkers are not accurate enough. Genetic signatures can be an enhanced prediction method. This research analyzed data from The Cancer Genome Atlas (TCGA) to identify a new genetic signature for predicting BC prognosis. Profiling of mRNA expression was carried out in samples from TCGA BC patients (n = 1222). Gene set enrichment analysis was performed to identify gene sets that differ significantly between BC tissues and normal tissues. Cox proportional hazards regression models were used to identify genes that were strongly linked to overall survival. A subsequent multivariate Cox regression analysis was used to construct a predictive risk model. Kaplan–Meier survival estimates and log-rank tests were used to verify the value of the risk model. Seven genes (PGK1, CACNA1H, IL13RA1, SDC1, AK3, NUP43, SDC3) correlated with glycolysis were shown to be strongly linked to overall survival. Based on the seven-gene signature, the 1222 BC patients were classified into high-risk and low-risk subgroups. Certain variables did not impair the prognostic potential of the seven-gene signature. A seven-gene signature correlated with cellular glycolysis was developed to predict the survival of BC patients. The results offer insight into cellular glycolysis mechanisms and the detection of patients with poor BC prognosis.
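The Kaplan–Meier estimate used to compare high-risk and low-risk groups can be illustrated with a short pure-Python sketch. The function name and the toy follow-up data are assumptions for illustration; the log-rank test mentioned in the abstract is not shown.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event (death) was observed, 0 if censored
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy cohort: the subject with time 2 is censored.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for t, s in km_curve:
    print(t, round(s, 3))
```

Censored subjects contribute to the at-risk count until their last follow-up but never trigger a drop in the curve, which is why the estimate falls to 0.375 rather than 0.5 at the second event in this toy cohort.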


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Wanting Song ◽  
Yi Bai ◽  
Jialin Zhu ◽  
Fanxin Zeng ◽  
Chunmeng Yang ◽  
...  

Abstract Background: Gastric cancer (GC) represents a major malignancy and is the third deadliest cancer globally. Several lines of evidence indicate that the epithelial-mesenchymal transition (EMT) has a critical function in the development of gastric cancer. Although many molecular biomarkers have been identified, a precise risk model is still necessary to help doctors determine patient prognosis in GC. Methods: Gene expression data and clinical information for GC were acquired from The Cancer Genome Atlas (TCGA) database, and 200 EMT-related genes (ERGs) were obtained from the Molecular Signatures Database (MSigDB). Then, ERGs correlated with patient prognosis in GC were assessed by univariable and multivariable Cox regression analyses. Next, a risk score formula was established for evaluating patient outcome in GC and validated by survival and ROC curves. In addition, Kaplan-Meier curves were generated to assess the associations of the clinicopathological data with prognosis. A cohort from the Gene Expression Omnibus (GEO) database was used for validation. Results: Six EMT-related genes, including CDH6, COL5A2, ITGAV, MATN3, PLOD2, and POSTN, were identified. Based on the risk model, GC patients were assigned to the high- and low-risk groups. The results revealed that the model had good performance in predicting patient prognosis in GC. Conclusions: We constructed a prognosis risk model for GC. Then, we verified the performance of the model, which may help doctors predict patient prognosis.

