scholarly journals Long-term cancer survival prediction using multimodal deep learning

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Luís A. Vale-Silva ◽  
Karl Rohr

AbstractThe age of precision medicine demands powerful computational techniques to handle high-dimensional patient data. We present MultiSurv, a multimodal deep learning method for long-term pan-cancer survival prediction. MultiSurv uses dedicated submodels to establish feature representations of clinical, imaging, and different high-dimensional omics data modalities. A data fusion layer aggregates the multimodal representations, and a prediction submodel generates conditional survival probabilities for follow-up time intervals spanning several decades. MultiSurv is the first non-linear and non-proportional survival prediction method that leverages multimodal data. In addition, MultiSurv can handle missing data, including single values and complete data modalities. MultiSurv was applied to data from 33 different cancer types and yields accurate pan-cancer patient survival curves. A quantitative comparison with previous methods showed that Multisurv achieves the best results according to different time-dependent metrics. We also generated visualizations of the learned multimodal representation of MultiSurv, which revealed insights on cancer characteristics and heterogeneity.

2020 ◽  
Author(s):  
Luis Andre Vale-Silva ◽  
Karl Rohr

The age of precision medicine demands powerful computational techniques to handle high-dimensional patient data. We present MultiSurv, a multimodal deep learning method for long-term pan-cancer survival prediction. MultiSurv is composed of three main modules. A feature representation module includes a dedicated submodel for each input data modality. A data fusion layer aggregates the multimodal representations. Finally, a prediction submodel yields conditional survival probabilities for a predefined set of follow-up time intervals. We trained MultiSurv on clinical, imaging, and four different high-dimensional omics data modalities from patients diagnosed with one of 33 different cancer types. We evaluated unimodal input configurations against several previous methods and different multimodal data combinations. MultiSurv achieved the best results according to different time-dependent metrics and delivered highly accurate long-term patient survival curves. The best performance was obtained when combining clinical information with either gene expression or DNA methylation data, depending on the evaluation metric. Additionally, MultiSurv can handle missing data, including missing values and complete data modalities. Interestingly, for unimodal data we found that simpler modeling approaches, including the classical Cox proportional hazards method, can achieve results rivaling those of more complex methods for certain data modalities. We also show how the learned feature representations of MultiSurv can be used to visualize relationships between cancer types and individual patients, after embedding into a low-dimensional space.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Erik van Dijk ◽  
Tom van den Bosch ◽  
Kristiaan J. Lenos ◽  
Khalid El Makrini ◽  
Lisanne E. Nijman ◽  
...  

AbstractSurvival rates of cancer patients vary widely within and between malignancies. While genetic aberrations are at the root of all cancers, individual genomic features cannot explain these distinct disease outcomes. In contrast, intra-tumour heterogeneity (ITH) has the potential to elucidate pan-cancer survival rates and the biology that drives cancer prognosis. Unfortunately, a comprehensive and effective framework to measure ITH across cancers is missing. Here, we introduce a scalable measure of chromosomal copy number heterogeneity (CNH) that predicts patient survival across cancers. We show that the level of ITH can be derived from a single-sample copy number profile. Using gene-expression data and live cell imaging we demonstrate that ongoing chromosomal instability underlies the observed heterogeneity. Analysing 11,534 primary cancer samples from 37 different malignancies, we find that copy number heterogeneity can be accurately deduced and predicts cancer survival across tissues of origin and stages of disease. Our results provide a unifying molecular explanation for the different survival rates observed between cancer types.


2020 ◽  
Vol 19 (1) ◽  
pp. 117-126
Author(s):  
Chunyu Wang ◽  
Junling Guo ◽  
Ning Zhao ◽  
Yang Liu ◽  
Xiaoyan Liu ◽  
...  

Author(s):  
Xiangtao Li ◽  
Shaochuan Li ◽  
Yunhe Wang ◽  
Shixiong Zhang ◽  
Ka-Chun Wong

Abstract The identification of hidden responders is often an essential challenge in precision oncology. A recent attempt based on machine learning has been proposed for classifying aberrant pathway activity from multiomic cancer data. However, we note several critical limitations there, such as high-dimensionality, data sparsity and model performance. Given the central importance and broad impact of precision oncology, we propose nature-inspired deep Ras activation pan-cancer (NatDRAP), a deep neural network (DNN) model, to address those restrictions for the identification of hidden responders. In this study, we develop the nature-inspired deep learning model that integrates bulk RNA sequencing, copy number and mutation data from PanCanAltas to detect pan-cancer Ras pathway activation. In NatDRAP, we propose to synergize the nature-inspired artificial bee colony algorithm with different gradient-based optimizers in one framework for optimizing DNNs in a collaborative manner. Multiple experiments were conducted on 33 different cancer types across PanCanAtlas. The experimental results demonstrate that the proposed NatDRAP can provide superior performance over other benchmark methods with strong robustness towards diagnosing RAS aberrant pathway activity across different cancer types. In addition, gene ontology enrichment and pathological analysis are conducted to reveal novel insights into the RAS aberrant pathway activity identification and characterization. NatDRAP is written in Python and available at https://github.com/lixt314/NatDRAP1.


2021 ◽  
Vol 4 ◽  
Author(s):  
Hamid Reza Hassanzadeh ◽  
May D. Wang

As a highly sophisticated disease that humanity faces, cancer is known to be associated with dysregulation of cellular mechanisms in different levels, which demands novel paradigms to capture informative features from different omics modalities in an integrated way. Successful stratification of patients with respect to their molecular profiles is a key step in precision medicine and in tailoring personalized treatment for critically ill patients. In this article, we use an integrated deep belief network to differentiate high-risk cancer patients from the low-risk ones in terms of the overall survival. Our study analyzes RNA, miRNA, and methylation molecular data modalities from both labeled and unlabeled samples to predict cancer survival and subsequently to provide risk stratification. To assess the robustness of our novel integrative analytics, we utilize datasets of three cancer types with 836 patients and show that our approach outperforms the most successful supervised and semi-supervised classification techniques applied to the same cancer prediction problems. In addition, despite the preconception that deep learning techniques require large size datasets for proper training, we have illustrated that our model can achieve better results for moderately sized cancer datasets.


2021 ◽  
Author(s):  
Marie PAVAGEAU ◽  
Louis REBAUD ◽  
Daphne MOREL ◽  
Stergios CHRISTODOULIDIS ◽  
Eric DEUTSCH ◽  
...  

RNA sequencing (RNAseq) analysis offers a tumor centered approach of growing interest for personalizing cancer care. However, existing methods , including deep learning models, struggle to reach satisfying performances on survival prediction based upon pan-cancer RNAseq data. Here, we present DeepOS, a novel deep learning model that predicts overall survival (OS) from pancancer RNAseq with a concordance index of 0.715 and a survival AUC of 0.752 across 33 TCGA tumor types whilst tested on an unseen test cohort. DeepOS notably uses (i) prior biological knowledge to condense inputs dimensionality, (ii) transfer learning to enlarge its training capacity through pretraining on organ prediction, and (iii) mean squared error adapted to survival loss function; all of which contributed to improve the model performances. Interpretation showed that DeepOS learned biologically relevant prognosis biomarkers. Altogether, DeepOS achieved unprecedented and consistent performances on pan-cancer prognosis estimation from individual RNA-seq data.


2021 ◽  
Vol 2 (4) ◽  
pp. 1-18
Author(s):  
Zhaohong Sun ◽  
Wei Dong ◽  
Jinlong Shi ◽  
Kunlun He ◽  
Zhengxing Huang

Survival analysis exhibits profound effects on health service management. Traditional approaches for survival analysis have a pre-assumption on the time-to-event probability distribution and seldom consider sequential visits of patients on medical facilities. Although recent studies leverage the merits of deep learning techniques to capture non-linear features and long-term dependencies within multiple visits for survival analysis, the lack of interpretability prevents deep learning models from being applied to clinical practice. To address this challenge, this article proposes a novel attention-based deep recurrent model, named AttenSurv , for clinical survival analysis. Specifically, a global attention mechanism is proposed to extract essential/critical risk factors for interpretability improvement. Thereafter, Bi-directional Long Short-Term Memory is employed to capture the long-term dependency on data from a series of visits of patients. To further improve both the prediction performance and the interpretability of the proposed model, we propose another model, named GNNAttenSurv , by incorporating a graph neural network into AttenSurv, to extract the latent correlations between risk factors. We validated our solution on three public follow-up datasets and two electronic health record datasets. The results demonstrated that our proposed models yielded consistent improvement compared to the state-of-the-art baselines on survival analysis.


Cancers ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 3047
Author(s):  
Xiaoyu Zhang ◽  
Yuting Xing ◽  
Kai Sun ◽  
Yike Guo

High-dimensional omics data contain intrinsic biomedical information that is crucial for personalised medicine. Nevertheless, it is challenging to capture them from the genome-wide data, due to the large number of molecular features and small number of available samples, which is also called “the curse of dimensionality” in machine learning. To tackle this problem and pave the way for machine learning-aided precision medicine, we proposed a unified multi-task deep learning framework named OmiEmbed to capture biomedical information from high-dimensional omics data with the deep embedding and downstream task modules. The deep embedding module learnt an omics embedding that mapped multiple omics data types into a latent space with lower dimensionality. Based on the new representation of multi-omics data, different downstream task modules were trained simultaneously and efficiently with the multi-task strategy to predict the comprehensive phenotype profile of each sample. OmiEmbed supports multiple tasks for omics data including dimensionality reduction, tumour type classification, multi-omics integration, demographic and clinical feature reconstruction, and survival prediction. The framework outperformed other methods on all three types of downstream tasks and achieved better performance with the multi-task strategy compared to training them individually. OmiEmbed is a powerful and unified framework that can be widely adapted to various applications of high-dimensional omics data and has great potential to facilitate more accurate and personalised clinical decision making.


PLoS ONE ◽  
2020 ◽  
Vol 15 (6) ◽  
pp. e0233678 ◽  
Author(s):  
Ellery Wulczyn ◽  
David F. Steiner ◽  
Zhaoyang Xu ◽  
Apaar Sadhwani ◽  
Hongwu Wang ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document