A meta-learning approach for genomic survival analysis

2020 ◽  
Author(s):  
Yeping Lina Qiu ◽  
Hong Zheng ◽  
Arnout Devos ◽  
Olivier Gevaert

Abstract: RNA sequencing has emerged as a promising approach in cancer prognosis as sequencing data becomes more easily and affordably accessible. However, it remains challenging to build good predictive models especially when the sample size is limited and the number of features is high, which is a common situation in biomedical settings. To address these limitations, we propose a meta-learning framework based on neural networks for survival analysis and evaluate it in a genomic cancer research setting. We demonstrate that, compared to regular transfer-learning, meta-learning is a significantly more effective paradigm to leverage high-dimensional data that is relevant but not directly related to the problem of interest. Specifically, meta-learning explicitly constructs a model, from abundant data of relevant tasks, to learn a new task with few samples effectively. For the application of predicting cancer survival outcome, we also show that the meta-learning framework with a few samples is able to achieve competitive performance with learning from scratch with a significantly larger number of samples. Finally, we demonstrate that the meta-learning model implicitly prioritizes genes based on their contribution to survival prediction and allows us to identify important pathways in cancer.

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Yeping Lina Qiu ◽  
Hong Zheng ◽  
Arnout Devos ◽  
Heather Selby ◽  
Olivier Gevaert



2021 ◽  
Author(s):  
Marie PAVAGEAU ◽  
Louis REBAUD ◽  
Daphne MOREL ◽  
Stergios CHRISTODOULIDIS ◽  
Eric DEUTSCH ◽  
...  

RNA sequencing (RNAseq) analysis offers a tumor-centered approach of growing interest for personalizing cancer care. However, existing methods, including deep learning models, struggle to reach satisfactory performance on survival prediction from pan-cancer RNAseq data. Here, we present DeepOS, a novel deep learning model that predicts overall survival (OS) from pan-cancer RNAseq with a concordance index of 0.715 and a survival AUC of 0.752 across 33 TCGA tumor types when tested on an unseen test cohort. DeepOS notably uses (i) prior biological knowledge to condense input dimensionality, (ii) transfer learning to enlarge its training capacity through pretraining on organ prediction, and (iii) a mean squared error loss function adapted to survival, all of which contributed to improving model performance. Interpretation showed that DeepOS learned biologically relevant prognostic biomarkers. Altogether, DeepOS achieved unprecedented and consistent performance on pan-cancer prognosis estimation from individual RNAseq data.
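
For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's C-index in plain Python. It is an illustration only, not the DeepOS evaluation code: the function name, the convention that a higher risk score means shorter expected survival, and the half-credit rule for tied risk scores are assumptions made for this example.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk scores (assumed: higher risk = shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if the patient with the shorter
            # time actually had an observed event.
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (made-up numbers): the third patient is censored.
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives some context for the 0.715 quoted in the abstract.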


Mathematics ◽  
2021 ◽  
Vol 9 (24) ◽  
pp. 3262
Author(s):  
Antonella Iuliano  ◽  
Annalisa Occhipinti  ◽  
Claudia Angelini  ◽  
Italia De Feis  ◽  
Pietro Liò 

Identifying relevant genomic features that can act as prognostic markers for building predictive survival models is one of the central themes in medical research, affecting the future of personalized medicine and omics technologies. However, the high dimension of genome-wide omic data, the strong correlation among the features, and the low sample size significantly increase the complexity of cancer survival analysis, demanding the development of specific statistical methods and software. Here, we present a novel R package, COSMONET (COx Survival Methods based On NETworks), that provides a complete workflow from the pre-processing of omics data to the selection of gene signatures and prediction of survival outcomes. In particular, COSMONET implements (i) three different screening approaches to reduce the initial dimension of the data from a high-dimensional space p to a moderate scale d, (ii) a network-penalized Cox regression algorithm to identify the gene signature, (iii) several approaches to determine an optimal cut-off on the prognostic index (PI) to separate high- and low-risk patients, and (iv) a prediction step for patients’ risk class based on the evaluation of PIs. Moreover, COSMONET provides functions for data pre-processing, visualization, survival prediction, and gene enrichment analysis. We illustrate COSMONET through a step-by-step R vignette using two cancer datasets.
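
Steps (iii) and (iv) above can be pictured with a small sketch. It assumes the prognostic index is the linear predictor of a fitted Cox model (the sum of coefficient times expression over the signature genes) and uses the median PI as the cut-off. The gene names, coefficients, and function names are hypothetical, and this Python code is not COSMONET's R API.

```python
from statistics import median


def prognostic_index(expression, coefficients):
    """PI for one patient: sum of coefficient * expression over signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_cutoff(pis, cutoff=None):
    """Label each patient 'high' or 'low' risk; defaults to a median cut-off."""
    if cutoff is None:
        cutoff = median(pis)  # assumed rule; other cut-off choices are possible
    labels = ["high" if pi > cutoff else "low" for pi in pis]
    return labels, cutoff


# Hypothetical two-gene signature and three patients (made-up values).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = [
    {"GENE_A": 1.0, "GENE_B": 0.0},
    {"GENE_A": 0.0, "GENE_B": 1.0},
    {"GENE_A": 2.0, "GENE_B": 1.0},
]
pis = [prognostic_index(p, coefficients) for p in patients]
labels, cutoff = split_by_cutoff(pis)
```

In practice the cut-off would be chosen on a training cohort and then applied unchanged to new patients, so that risk classes are not tuned on the data being predicted.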


2019 ◽  
Vol 26 (1) ◽  
pp. 8-20 ◽  
Author(s):  
Yi Guo ◽  
Jiang Bian ◽  
Francois Modave ◽  
Qian Li ◽  
Thomas J George ◽  
...  

Cancer is the second leading cause of death in the United States. To improve cancer prognosis and survival rates, a better understanding of multi-level contributory factors associated with cancer survival is needed. However, prior research on cancer survival has primarily focused on factors from the individual level due to limited availability of integrated datasets. In this study, we sought to examine how data integration impacts the performance of cancer survival prediction models. We linked data from four different sources and evaluated the performance of Cox proportional hazard models for breast, lung, and colorectal cancers under three common data integration scenarios. We showed that adding additional contextual-level predictors to survival models through linking multiple datasets improved model fit and performance. We also showed that different representations of the same variable or concept have differential impacts on model performance. When building statistical models for cancer outcomes, it is important to consider cross-level predictor interactions.


2017 ◽  
pp. 1-13
Author(s):  
Yi Cui ◽  
Bailiang Li ◽  
Ruijiang Li

Purpose: A significant hurdle in developing reliable gene expression–based prognostic models has been the limited sample size, which can cause overfitting and false discovery. Combining data from multiple studies can enhance statistical power and reduce spurious findings, but how to address the biologic heterogeneity across different datasets remains a major challenge. Better meta-survival analysis approaches are needed. Material and Methods: We presented a decentralized learning framework for meta-survival analysis without the need for data aggregation. Our method consisted of a series of proposals that together alleviated the influence of data heterogeneity and improved the performance of survival prediction. First, we transformed the gene expression profile of every sample into normalized percentile ranks to obtain platform-agnostic features. Second, we used Stouffer’s meta-z approach in combination with Harrell’s concordance index to prioritize and select genes to be included in the model. Third, we used survival discordance as a scale-independent model loss function. Instead of generating a merged dataset and training the model therein, we avoided comparing patients across datasets and individually evaluated the loss function on each dataset. Finally, we optimized the model by minimizing the joint loss function. Results: Through comprehensive evaluation on 31 public microarray datasets containing 6,724 samples of several cancer types, we demonstrated that the proposed method has outperformed (1) single prognostic genes identified using conventional meta-analysis, (2) multigene signatures trained on single datasets, (3) multigene signatures trained on merged datasets as well as by other existing meta-analysis methods, and (4) clinically applicable, established multigene signatures. Conclusion: The decentralized learning approach can be used to effectively perform meta-analysis of gene expression data and to develop robust multigene prognostic signatures.
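
The first two steps of the method can be sketched as follows. This is a hedged illustration, not the authors' implementation: it shows a within-sample percentile-rank transform (ties share their average rank) and the standard Stouffer combination Z = Σ wᵢzᵢ / √(Σ wᵢ²). How a per-dataset z-score is derived from the concordance index is not stated in the abstract and is left out; the function names and toy numbers are assumptions.

```python
import math


def percentile_ranks(values):
    """Map one sample's expression values to percentile ranks in (0, 1].

    Tied values share the average of their 1-based ranks, so the output
    depends only on the ordering, not on the measurement platform's scale.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank / n
        pos = end + 1
    return ranks


def stouffer_z(z_scores, weights=None):
    """Combine per-dataset z-scores for one gene with Stouffer's method."""
    if weights is None:
        weights = [1.0] * len(z_scores)  # unweighted by default
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


# Toy values: one sample with four genes, and one gene seen in three datasets.
sample_ranks = percentile_ranks([10, 30, 20, 30])
combined = stouffer_z([1.0, 2.0, 3.0])
```

Because each sample is ranked on its own and each dataset contributes only a summary z-score, neither step requires pooling raw expression values across studies, which matches the decentralized setting described above.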


2020 ◽  
Author(s):  
Jiarui Feng ◽  
Heming Zhang ◽  
Fuhai Li

Abstract: Survival analysis and prediction are important in cancer studies. In addition to the Cox proportional hazards model, recently deep learning models have been proposed to integrate the multi-omics data for survival prediction. Cancer signaling pathways are important and interpretable concepts that define the signaling cascades regulating cancer development and drug resistance. Thus, it is interesting and important to investigate the relevance to patients’ survival of individual signaling pathways. In this exploratory study, we propose to investigate the relevance and difference of a small set of core cancer signaling pathways in the survival analysis of cancer patients. Specifically, we built a biologically meaningful and simplified deep neural network, DeepSigSurvNet, for survival prediction. In the model, the gene expression and copy number data of 1648 genes from 46 major signaling pathways are used. We applied the model on 4 types of cancer and investigated the relevance and difference of the 46 signaling pathways among the 4 types of cancer. Interestingly, the interpretable analysis identified the distinct patterns of these signaling pathways, which are helpful to understand the relevance of the signaling pathways in terms of their association with cancer survival time. These highly relevant signaling pathways can be novel targets, combined with other essential signaling pathways inhibitors, for drug and drug combination prediction to improve cancer patients’ survival time.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Erik van Dijk ◽  
Tom van den Bosch ◽  
Kristiaan J. Lenos ◽  
Khalid El Makrini ◽  
Lisanne E. Nijman ◽  
...  

Abstract: Survival rates of cancer patients vary widely within and between malignancies. While genetic aberrations are at the root of all cancers, individual genomic features cannot explain these distinct disease outcomes. In contrast, intra-tumour heterogeneity (ITH) has the potential to elucidate pan-cancer survival rates and the biology that drives cancer prognosis. Unfortunately, a comprehensive and effective framework to measure ITH across cancers is missing. Here, we introduce a scalable measure of chromosomal copy number heterogeneity (CNH) that predicts patient survival across cancers. We show that the level of ITH can be derived from a single-sample copy number profile. Using gene-expression data and live cell imaging we demonstrate that ongoing chromosomal instability underlies the observed heterogeneity. Analysing 11,534 primary cancer samples from 37 different malignancies, we find that copy number heterogeneity can be accurately deduced and predicts cancer survival across tissues of origin and stages of disease. Our results provide a unifying molecular explanation for the different survival rates observed between cancer types.


2021 ◽  
pp. 1-1
Author(s):  
Qi Liu ◽  
Xinyu Zhang ◽  
Yongxiang Liu ◽  
Kai Huo ◽  
Weidong Jiang ◽  
...  
