Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data

Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2013
Author(s):  
Edian F. Franco ◽  
Pratip Rana ◽  
Aline Cruz ◽  
Víctor V. Calderón ◽  
Vasco Azevedo ◽  
...  

A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending upon the activated pathway(s), patient survival varies significantly, as does the efficacy of various drugs. Therefore, cancer subtype detection using genomics-level data is a significant research problem. Subtype detection is often a complex problem and, in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. This shows that the results from the autoencoders, obtained through the interaction of different datatypes of cancer, can be used for the prediction and characterization of patient subgroups and survival profiles.
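The silhouette-based choice of the number of subtypes mentioned in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance on the learned features and takes the candidate clusterings as given; the function names `silhouette_score` and `best_k` are hypothetical and are not taken from the paper, which does not state its implementation here.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        # Mean distance from sample i to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points)
                       if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:
            scores.append(0.0)  # singleton cluster: silhouette is taken as 0
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_k(points, labelings):
    """Pick the k whose labeling has the highest mean silhouette.

    `labelings` maps a candidate number of clusters to the labels produced
    by any clustering method run with that k (an assumption of this sketch).
    """
    return max(labelings, key=lambda k: silhouette_score(points, labelings[k]))
```

In use, one would cluster the autoencoder's latent features for each candidate k, pass the resulting labelings to `best_k`, and keep the k with the highest score.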

Author(s):  
Edian F. Franco ◽  
Pratip Rana ◽  
Aline Cruz ◽  
Víctor V. Calderón ◽  
Vasco Azevedo ◽  
...  

A heterogeneous disease like cancer is activated through multiple pathways and different perturbations. Depending upon the activated pathway(s), patient survival varies significantly, as does the efficacy of various drugs. Therefore, cancer subtype detection using genomics-level data is a significant research problem. Subtype detection is often a complex problem and, in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score. We observed that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection. To evaluate the results obtained, we selected the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes identified by the autoencoders; the results agree well with other genomic studies and can be corroborated with the involved pathways and biological functions. This shows that the results from the autoencoders, obtained through the interaction of different datatypes of cancer, can be used for the prediction and characterization of patient subgroups and survival profiles.


Gut ◽  
2020 ◽  
pp. gutjnl-2020-320930 ◽  
Author(s):  
Jie-Yi Shi ◽  
Xiaodong Wang ◽  
Guang-Yu Ding ◽  
Zhou Dong ◽  
Jing Han ◽  
...  

Objective: Tumour pathology contains rich information, including tissue structure and cell morphology, that reflects disease progression and patient survival. However, phenotypic information is subtle and complex, making the discovery of prognostic indicators from pathological images challenging. Design: An interpretable, weakly supervised deep learning framework incorporating prior knowledge was proposed to analyse hepatocellular carcinoma (HCC) and explore new prognostic phenotypes on pathological whole-slide images (WSIs) from the Zhongshan cohort of 1125 HCC patients (2451 WSIs) and TCGA cohort of 320 HCC patients (320 WSIs). A ‘tumour risk score (TRS)’ was established to evaluate patient outcomes, and then risk activation mapping (RAM) was applied to visualise the pathological phenotypes of TRS. The multi-omics data of The Cancer Genome Atlas (TCGA) HCC were used to assess the potential pathogenesis underlying TRS. Results: Survival analysis revealed that TRS was an independent prognosticator in both the Zhongshan cohort (p<0.0001) and TCGA cohort (p=0.0003). The predictive ability of TRS was superior to and independent of clinical staging systems, and TRS could evenly stratify patients into up to five groups with significantly different prognoses. Notably, sinusoidal capillarisation, prominent nucleoli and karyotheca, the nucleus/cytoplasm ratio and infiltrating inflammatory cells were identified as the main underlying features of TRS. The multi-omics data of TCGA HCC hint at the relevance of TRS to tumour immune infiltration and genetic alterations such as the FAT3 and RYR2 mutations. Conclusion: Our deep learning framework is an effective and labour-saving method for decoding pathological images, providing a valuable means for HCC risk stratification and precise patient treatment.


2019 ◽  
Author(s):  
Hua Chai ◽  
Xiang Zhou ◽  
Zifeng Cui ◽  
Jiahua Rao ◽  
Zheng Hu ◽  
...  

Motivation: Accurately predicting cancer prognosis is necessary to choose precise treatment strategies for patients. One effective approach to this prediction is the integration of multi-omics data, which reduces the impact of noise within single-omics data. However, integrating multi-omics data introduces a large number of redundant variables alongside relatively small sample sizes. In this study, we employed autoencoder networks to extract important features that were then input to the proportional hazards model to predict cancer prognosis. Results: The method was applied to 12 common cancers from The Cancer Genome Atlas. The results show that multi-omics data improve the C-index for prognosis prediction by 4.1% on average over single mRNA data, and our method outperforms previous approaches by at least 7.4%. A comparison of the contribution of single omics data shows that mRNA contributes the most, followed by DNA methylation, miRNA, and copy number variation. In the case study for differential gene expression analysis, we identified 161 differentially expressed genes in cervical cancer, among which 77 genes (65.8%) have been proven to be associated with cancer. In addition, we performed a cross-cancer test where the model trained on one cancer was used to predict the prognosis of another cancer, and found that 23 pairs of cancers have a C-index larger than 0.5, with the largest value of 0.68. Thus, this study provides a deep learning framework to effectively integrate multiple omics data to predict cancer prognosis.
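For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the authors' code. The function name and argument layout are assumptions of this sketch, and pairs with tied survival times are simply skipped as a simplification.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : observed follow-up time per patient
    events : 1/True if the event (e.g. death) was observed, 0/False if censored
    risks  : predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier-failing member
        for j in range(n):
            # A pair is comparable when patient i failed before time j.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk failed first: concordant
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to a ranking no better than chance and 1.0 to a perfect ranking, which is why the cross-cancer test above treats 0.5 as the reference point.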


2020 ◽  
Author(s):  
Chi Xu ◽  
Denghui Liu ◽  
Lei Zhang ◽  
Zhimeng Xu ◽  
Wenjun He ◽  
...  

Deep learning is very promising for solving problems in omics research, such as genomics, epigenomics, proteomics, and metabolomics. The design of the neural network architecture is very important when modeling omics data for different scientific problems. The residual fully-connected neural network (RFCN) was proposed to provide better neural network architectures for modeling omics data. The next challenge for omics research is how to integrate information from different omics data using deep learning, so that information from different molecular system levels can be combined to predict the target. In this paper, we present a novel multimodal approach that efficiently integrates information from different omics data and achieves better accuracy than previous approaches. We evaluate our method on four different tasks: drug repositioning, target gene prediction, breast cancer subtyping and cancer type prediction, and it achieves state-of-the-art performance on all four. The multimodal approach is implemented in AutoGenome V2 and retains all the earlier AutoML conveniences to facilitate the work of biomedical researchers.


2019 ◽  
Vol 35 (19) ◽  
pp. 3718-3726 ◽  
Author(s):  
Peifeng Ruan ◽  
Ya Wang ◽  
Ronglai Shen ◽  
Shuang Wang

Motivation: Recent technology developments have made it possible to generate various kinds of omics data, which provides opportunities to better solve problems such as disease subtyping or disease mapping by using more comprehensive omics data jointly. Among the many data-integration methods developed, the similarity network fusion (SNF) method has shown great potential to identify new disease subtypes by separating similar subjects using multi-omics data. SNF effectively fuses similarity networks with pairwise patient similarity measures from different types of omics data into one fused network using both shared and complementary information across multiple types of omics data. Results: In this article, we proposed an association-signal-annotation boosted similarity network fusion (ab-SNF) method, which adds feature-level association signal annotations as weights, aiming to up-weight signal features and down-weight noise features when constructing subject similarity networks, to boost the performance in disease subtyping. In various simulation studies, the proposed ab-SNF outperforms the original SNF approach without weights. Most importantly, the improvement in the subtyping performance due to association-signal-annotation weights is amplified in the integration process. Applications to somatic mutation data, DNA methylation data and gene expression data of three cancer types from The Cancer Genome Atlas project suggest that the proposed ab-SNF method consistently identifies new subtypes in each cancer that more accurately predict patient survival and are more biologically meaningful. Availability and implementation: The R package abSNF is freely available for downloading from https://github.com/pfruan/abSNF. Supplementary information: Supplementary data are available at Bioinformatics online.
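The idea of up-weighting signal features when building a subject similarity network can be sketched as follows. This is only an illustration under stated assumptions: it uses a weighted squared Euclidean distance followed by a plain Gaussian kernel with a fixed bandwidth `sigma`, which is a simplification and not the exact kernel or weighting scheme of SNF or ab-SNF. The function names are hypothetical and do not come from the abSNF package.

```python
import math


def weighted_sq_distance(x, y, weights):
    """Squared Euclidean distance with per-feature weights normalised to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w / total * (a - b) ** 2 for a, b, w in zip(x, y, weights))


def similarity_matrix(samples, weights, sigma=1.0):
    """Pairwise subject similarity from the weighted distance (Gaussian kernel).

    The fixed bandwidth `sigma` is an assumption of this sketch.
    """
    n = len(samples)
    return [
        [
            math.exp(-weighted_sq_distance(samples[i], samples[j], weights)
                     / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]
```

With uniform weights, a noisy feature can dominate the distance between two subjects; putting most of the weight on features with association signal lets the network reflect the informative differences instead.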


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Sara Pidò ◽  
Gaia Ceddia ◽  
Marco Masseroli

The complexity of cancer has always been a major obstacle to understanding the source of this disease. However, by appreciating its complexity, we can shed some light on crucial gene associations across and within specific cancer types. In this study, we develop a general framework to infer relevant gene biomarkers and their gene-to-gene associations using multiple gene co-expression networks for each cancer type. Specifically, we infer computationally and biologically interesting communities of genes from kidney renal clear cell carcinoma, liver hepatocellular carcinoma, and prostate adenocarcinoma data sets of The Cancer Genome Atlas (TCGA) database. The gene communities are extracted through a data-driven pipeline and then evaluated through both functional analyses and literature findings. Furthermore, we provide a computational validation of their relevance for each cancer type by comparing the performance of normal/cancer classification for our identified gene sets and other gene signatures, including the typically used differentially expressed genes. The hallmark of this study is its approach based on gene co-expression networks from different similarity measures: by using a combination of multiple gene networks and then fusing normal and cancer networks for each cancer type, we can gain better insight into the overall structure of the cancer-type-specific network.
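A gene co-expression network of the kind described above can be sketched with a single similarity measure. The example assumes Pearson correlation and a hard threshold on its absolute value. Both choices are illustrative assumptions; the abstract does not specify which similarity measures or cut-offs the study uses. The function names and gene labels are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_edges(expression, threshold=0.8):
    """Return {(gene_a, gene_b): r} for gene pairs with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    """
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges
```

Building one such edge set per similarity measure, and one per tissue state (normal and cancer), would give the multiple networks that a fusion step could then combine.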


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 21642-21652
Author(s):  
Murtadha D. Hssayeni ◽  
Behnaz Ghoraani
