CancerSiamese: one-shot learning for predicting primary and metastatic tumor types unseen during model training

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Milad Mostavi ◽  
Yu-Chiao Chiu ◽  
Yidong Chen ◽  
Yufei Huang

Abstract Background: State-of-the-art deep-learning-based cancer type prediction can only predict cancer types whose samples are available during training, where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen during training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. Results: We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from the Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was utilized to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer-related functions between primary and metastatic tumors. Conclusion: This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognosis, and our understanding of cancer.
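The N-way one-shot setting and the 1-NN benchmark named in the abstract can be illustrated with a minimal sketch. The Euclidean distance, the function names, and the toy 4-gene profiles below are assumptions made for illustration; the abstract does not specify the distance used by the benchmark, and this is not the CancerSiamese model itself.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def one_shot_1nn(query, support):
    """Return the type whose single support sample is closest to the query.

    `support` maps each of the N candidate cancer types to exactly one
    known profile, which is what makes the task "one-shot".
    """
    return min(support, key=lambda label: euclidean(query, support[label]))

# Toy 3-way task with made-up profiles (hypothetical, not real data).
support = {
    "type_A": [5.0, 0.1, 0.2, 3.0],
    "type_B": [0.2, 4.8, 0.1, 0.3],
    "type_C": [0.1, 0.2, 5.1, 2.9],
}
query = [4.7, 0.3, 0.0, 2.8]
prediction = one_shot_1nn(query, support)
```

A Siamese model would replace the fixed distance on raw profiles with a learned similarity computed on the outputs of the two parallel networks, while keeping the same "compare the query with each support sample" structure.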

2020 ◽  
Author(s):  
Milad Mostavi ◽  
Yu-Chiao Chiu ◽  
Yidong Chen ◽  
Yufei Huang

Abstract We consider cancer classification based on a single gene expression profile. We propose CancerSiamese, a new one-shot learning model, to predict the cancer type of a query primary or metastatic tumor sample based on a support set that contains only one known sample for each cancer type. CancerSiamese receives pairs of gene expression profiles and learns a representation of similar or dissimilar cancer types through two parallel Convolutional Neural Networks joined by a similarity function. We trained CancerSiamese for both primary and metastatic cancer type predictions using samples from TCGA and MET500. Test results for different N-way predictions yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to identify and analyze the marker-gene candidates for primary and metastatic cancers. Our work demonstrated, for the first time, the feasibility of applying one-shot learning for expression-based cancer type prediction when gene expression data of cancer types are limited and could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, treatment planning, and our understanding of cancer.


2016 ◽  
Author(s):  
Aristeidis G. Telonis ◽  
Rogan Magee ◽  
Phillipe Loher ◽  
Inna Chervoneva ◽  
Eric Londin ◽  
...  

Previously, we demonstrated that miRNA isoforms (isomiRs) are constitutive and their expression profiles depend on tissue, tissue state, and disease subtype. We have now extended our isomiR studies to The Cancer Genome Atlas (TCGA) repository. Specifically, we studied whether isomiR profiles can distinguish amongst the 32 cancers. We analyzed 10,271 datasets from 32 cancers and found 7,466 isomiRs from 807 miRNA hairpin-arms to be expressed above threshold. Using the top 20% most abundant isomiRs, we built a classifier that relied on “binary” isomiR profiles: isomiRs were simply represented as ‘present’ or ‘absent’ and, unlike previous methods, all knowledge about their expression levels was ignored. The classifier could label tumor samples with an average sensitivity of 93% and a False Discovery Rate of 3%. Notably, its ability to classify well persisted even when we reduced the set of used features (=isomiRs) by a factor of 10. A counterintuitive finding of our analysis is that the isomiRs and miRNA loci with the highest ability to classify tumors are not the ones that have been attracting the most research attention in the miRNA field. Our results provide a framework in which to study cancer-type-specific isomiRs and explore their potential uses as cancer biomarkers.
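The "binary" profile idea can be sketched as follows. The threshold value, the isomiR names, and the Hamming comparison are hypothetical choices made for illustration; the abstract states only that isomiRs are marked 'present' or 'absent' and that expression levels are otherwise ignored.

```python
def binarize(profile, threshold):
    """Map each isomiR's expression to 1 (present) or 0 (absent).

    An isomiR counts as present only when its value is above the threshold.
    """
    return {name: int(value > threshold) for name, value in profile.items()}

def hamming(a, b):
    """Count the isomiRs whose presence/absence differs between two profiles."""
    return sum(a[name] != b[name] for name in a)

# Two made-up samples with very different expression levels (not real data).
sample_1 = {"isomiR_1": 120.0, "isomiR_2": 0.0, "isomiR_3": 15.0}
sample_2 = {"isomiR_1": 900.0, "isomiR_2": 2.0, "isomiR_3": 11.0}

binary_1 = binarize(sample_1, threshold=10.0)
binary_2 = binarize(sample_2, threshold=10.0)
difference = hamming(binary_1, binary_2)
```

Although the raw values differ by almost an order of magnitude, both samples reduce to the same presence/absence pattern, which is the information a binary-profile classifier would work with.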


2016 ◽  
Vol 15s2 ◽  
pp. CIN.S39367 ◽  
Author(s):  
Seyedsasan Hashemikhabir ◽  
Gungor Budak ◽  
Sarath Chandra Janga

Survival analysis in biomedical sciences is generally performed by correlating the levels of cellular components with patients’ clinical features as a common practice in prognostic biomarker discovery. While the common and primary focus of such analysis in cancer genomics so far has been to identify the potential prognostic genes, alternative splicing, a posttranscriptional regulatory mechanism that affects the functional form of a protein through the inclusion or exclusion of individual exons and gives rise to alternative protein products, has increasingly gained attention due to the prevalence of splicing aberrations in cancer transcriptomes. Hence, uncovering the potential prognostic exons can not only help in rationally designing exon-specific therapeutics but also increase specificity toward more personalized treatment options. To address this gap and to provide a platform for rational identification of prognostic exons from cancer transcriptomes, we developed ExSurv ( https://exsurv.soic.iupui.edu ), a web-based platform for predicting the survival contribution of all annotated exons in the human genome using RNA sequencing-based expression profiles for cancer samples from four cancer types available from The Cancer Genome Atlas. ExSurv enables users to search for a gene of interest and shows survival probabilities for all the exons associated with a gene and found to be significant at the chosen threshold. ExSurv also includes raw expression values across the cancer cohort as well as the survival plots for prognostic exons. Our analysis of the resulting prognostic exons across four cancer types revealed that most of the survival-associated exons are unique to a cancer type with few processes such as cell adhesion, carboxylic, fatty acid metabolism, and regulation of T-cell signaling common across cancer types, possibly suggesting significant differences in the posttranscriptional regulatory pathways contributing to prognosis.


2021 ◽  
Author(s):  
H. Robert Frost

Abstract The genetic alterations that underlie cancer development are highly tissue-specific with the majority of driving alterations occurring in only a few cancer types and with alterations common to multiple cancer types often showing a tissue-specific functional impact. This tissue-specificity means that the biology of normal tissues carries important information regarding the pathophysiology of the associated cancers, information that can be leveraged to improve the power and accuracy of cancer genomic analyses. Research exploring the use of normal tissue data for the analysis of cancer genomics has primarily focused on the paired analysis of tumor and adjacent normal samples. Efforts to leverage the general characteristics of normal tissue for cancer analysis have received less attention with most investigations focusing on understanding the tissue-specific factors that lead to individual genomic alterations or dysregulated pathways within a single cancer type. To address this gap and support scenarios where adjacent normal tissue samples are not available, we explored the genome-wide association between the transcriptomes of 21 solid human cancers and their associated normal tissues as profiled in healthy individuals. While the average gene expression profiles of normal and cancerous tissue may appear distinct, with normal tissues more similar to other normal tissues than to the associated cancer types, when transformed into relative expression values, i.e., the ratio of expression in one tissue or cancer relative to the mean in other tissues or cancers, the close association between gene activity in normal tissues and related cancers is revealed. As we demonstrate through an analysis of tumor data from The Cancer Genome Atlas and normal tissue data from the Human Protein Atlas, this association between tissue-specific and cancer-specific expression values can be leveraged to improve the prognostic modeling of cancer, the comparative analysis of different cancer types, and the analysis of cancer and normal tissue pairs.
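The relative expression transform described above (expression in one tissue divided by the mean expression in the other tissues) can be sketched for a single gene as follows. The function name and the toy values are assumptions; the abstract does not say whether a pseudocount or log transform is also applied, so none is used here.

```python
def relative_expression(expr):
    """Return tissue -> ratio of its expression to the mean of all other tissues.

    `expr` maps tissue (or cancer) name -> expression of one gene.
    Assumes at least two tissues and a nonzero mean among the others.
    """
    result = {}
    for tissue, value in expr.items():
        others = [v for name, v in expr.items() if name != tissue]
        result[tissue] = value / (sum(others) / len(others))
    return result

# Made-up expression of one gene in three tissues (not real data).
gene_expr = {"liver": 8.0, "lung": 2.0, "kidney": 2.0}
rel = relative_expression(gene_expr)
```

In this toy case the gene is four times more abundant in liver than the average of the other two tissues, so its relative value there is well above 1, while the lung and kidney values fall below 1.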


2021 ◽  
Vol 11 ◽  
Author(s):  
Wencheng Zhang ◽  
Zhouyong Gao ◽  
Mingxiu Guan ◽  
Ning Liu ◽  
Fanjie Meng ◽  
...  

Anti-silencing function 1B histone chaperone (ASF1B) is known to be an important modulator of oncogenic processes, yet its role in lung adenocarcinoma (LUAD) remains to be defined. In this study, an integrated assessment of The Cancer Genome Atlas (TCGA) and genotype-tissue expression (GTEx) datasets revealed the overexpression of ASF1B in all analyzed cancer types other than LAML. Genetic, epigenetic, microsatellite instability (MSI), and tumor mutational burden (TMB) analysis showed that ASF1B was regulated by single or multiple factors. Kaplan-Meier survival curves suggested that elevated ASF1B expression was associated with better or worse survival in a cancer type-dependent manner. The CIBERSORT algorithm was used to evaluate immune microenvironment composition, and distinct correlations between ASF1B expression and immune cell infiltration were evident when comparing tumor and normal tissue samples. Gene set enrichment analysis (GSEA) indicated that ASF1B was associated with proliferation- and immunity-related pathways. Knocking down ASF1B impaired proliferation, affected cell cycle distribution, and induced cell apoptosis in LUAD cell lines. In contrast, ASF1B overexpression had no impact on the malignant characteristics of LUAD cells. At the mechanistic level, ASF1B served as an indirect regulator of DNA Polymerase Epsilon 3, Accessory Subunit (POLE3), CDC28 protein kinase regulatory subunit 1 (CKS1B), and dihydrofolate reductase (DHFR), as established through proteomic profiling and Immunoprecipitation-Mass Spectrometry (IP-MS) analyses. Overall, these data suggested that ASF1B serves as a tumor promoter and potential target for cancer therapy and provided us with clues to better understand the importance of ASF1B in many types of cancer.


NAR Cancer ◽  
2020 ◽  
Vol 2 (1) ◽  
Author(s):  
Julianne K David ◽  
Sean K Maden ◽  
Benjamin R Weeder ◽  
Reid F Thompson ◽  
Abhinav Nellore

Abstract This study probes the distribution of putatively cancer-specific junctions across a broad set of publicly available non-cancer human RNA sequencing (RNA-seq) datasets. We compared cancer and non-cancer RNA-seq data from The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx) Project and the Sequence Read Archive. We found that (i) averaging across cancer types, 80.6% of exon–exon junctions thought to be cancer-specific based on comparison with tissue-matched samples (σ = 13.0%) are in fact present in other adult non-cancer tissues throughout the body; (ii) 30.8% of junctions not present in any GTEx or TCGA normal tissues are shared by multiple samples within at least one cancer type cohort, and 87.4% of these distinguish between different cancer types; and (iii) many of these junctions not found in GTEx or TCGA normal tissues (15.4% on average, σ = 2.4%) are also found in embryological and other developmentally associated cells. These findings refine the meaning of RNA splicing event novelty, particularly with respect to the human neoepitope repertoire. Ultimately, cancer-specific exon–exon junctions may have a substantial causal relationship with the biology of disease.


2017 ◽  
Vol 3 (1) ◽  
pp. 53 ◽  
Author(s):  
Tomoya Mori ◽  
Junko Yamane ◽  
Kenta Kobayashi ◽  
Nobuko Taniyama ◽  
Takanori Tano ◽  
...  

In silico three-dimensional (3D) reconstruction of tissues/organs based on single-cell profiles is required to comprehensively understand how individual cells are organized in actual tissues/organs. Although several tissue reconstruction methods have been developed, they are still insufficient to map cells on the original tissues in terms of both scale and quality. In this study, we aim to develop a novel informatics approach that can reconstruct whole and various tissues/organs in silico. As the first step of this project, we conducted single-cell transcriptome analysis of 38 individual cells obtained from two mouse blastocysts (E3.5d) and tried to reconstruct blastocyst structures in 3D. In the reconstruction step, each cell position is estimated by 3D principal component analysis and expression profiles of cell adhesion genes as well as other marker genes. In addition, we also proposed a reconstruction method without using marker gene information. The resulting reconstructed blastocyst structures implied an indirect relationship between the genes of Myh9 and Oct4.


2019 ◽  
Author(s):  
Lin Li ◽  
Mengyuan Li ◽  
Xiaosheng Wang

Abstract Many studies have shown that TP53 mutations play a negative role in antitumor immunity. However, a few studies reported that TP53 mutations could promote antitumor immunity. To explain these contradictory findings, we analyzed five cancer cohorts from The Cancer Genome Atlas (TCGA) project. We found that TP53-mutated cancers had significantly higher levels of antitumor immune signatures than TP53-wildtype cancers in breast invasive carcinoma (BRCA) and lung adenocarcinoma (LUAD). In contrast, TP53-mutated cancers had significantly lower antitumor immune signature levels than TP53-wildtype cancers in stomach adenocarcinoma (STAD), colon adenocarcinoma (COAD), and head and neck squamous cell carcinoma (HNSC). Moreover, TP53-mutated cancers likely had higher tumor mutation burden (TMB) and tumor aneuploidy level (TAL) than TP53-wildtype cancers. However, the TMB differences were more marked between TP53-mutated and TP53-wildtype cancers than the TAL differences in BRCA and LUAD, and the TAL differences were more significant in STAD and COAD. Furthermore, we showed that TMB and TAL had a positive and a negative correlation with antitumor immunity and that TMB affected antitumor immunity more greatly than TAL did in BRCA and LUAD while TAL affected antitumor immunity more strongly than TMB in STAD and HNSC. These findings indicate that the distinct correlations between TP53 mutations and antitumor immunity in different cancer types are a consequence of the joint effect of the altered TMB and TAL caused by TP53 mutations on tumor immunity. Our data suggest that the TP53 mutation status could be a useful biomarker for cancer immunotherapy response depending on cancer types.


2017 ◽  
Author(s):  
Zhuyi Xue ◽  
René L Warren ◽  
Ewan A Gibb ◽  
Daniel MacMillan ◽  
Johnathan Wong ◽  
...  

Abstract Alternative polyadenylation (APA) of 3’ untranslated regions (3’ UTRs) has been implicated in cancer development. Earlier reports on APA in cancer primarily focused on 3’ UTR length modifications, and the conventional wisdom is that tumor cells preferentially express transcripts with shorter 3’ UTRs. Here, we analyzed the APA patterns of 114 genes, a select list of oncogenes and tumor suppressors, in 9,939 tumor and 729 normal tissue samples across 33 cancer types using RNA-Seq data from The Cancer Genome Atlas, and we found that the APA regulation machinery is much more complicated than what was previously thought. We report 77 cases (gene-cancer type pairs) of differential 3’ UTR cleavage patterns between normal and tumor tissues, involving 33 genes in 13 cancer types. For 15 genes, the tumor-specific cleavage patterns are recurrent across multiple cancer types. While the cleavage patterns in certain genes indicate apparent trends of 3’ UTR shortening in tumor samples, over half of the 77 cases imply 3’ UTR length change trends in cancer that are more complex than simple shortening or lengthening. This work extends the current understanding of APA regulation in cancer, and demonstrates how large volumes of RNA-seq data generated for characterizing cancer cohorts can be mined to investigate this process.


2016 ◽  
Author(s):  
Roni Rasnic ◽  
Nathan Linial ◽  
Michal Linial

ABSTRACT The primary function of microRNAs (miRNAs) is to maintain cell homeostasis. In cancerous tissues, miRNA expression undergoes drastic alterations. In this study, we used miRNA expression profiles from The Cancer Genome Atlas (TCGA) of 24 cancer types and 3 healthy tissues, collected from >8500 samples. We seek to classify the cancer’s origin and tissue identification using the expression from 1046 reported miRNAs. Despite an apparent uniform appearance of miRNAs among cancerous samples, we recover indispensable information from lowly expressed miRNAs regarding the cancer/tissue types. Multiclass support vector machine classification yields an average recall of 58% in identifying the correct tissue and tumor types. Data discretization has led to substantial improvement, reaching an average recall of 91% (95% median). We propose a straightforward protocol as a crucial step in classifying tumors of unknown primary origin. Our counter-intuitive conclusion is that in almost all cancer types, highly expressed miRNAs mask the significant signal that lower expressed miRNAs provide.
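The "average recall" metric reported above can be read as the mean of per-class recalls. The sketch below assumes an unweighted (macro) average over classes; the abstract does not state how the averaging was done, and the labels are made up for illustration.

```python
from collections import Counter

def average_recall(y_true, y_pred):
    """Unweighted mean over classes of (correct predictions / true samples)."""
    total = Counter(y_true)   # number of true samples per class
    correct = Counter()       # number of correctly predicted samples per class
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

# Hypothetical labels for six samples (not real data).
y_true = ["breast", "breast", "lung", "lung", "lung", "colon"]
y_pred = ["breast", "lung", "lung", "lung", "colon", "colon"]
score = average_recall(y_true, y_pred)
```

Here the per-class recalls are 1/2 for breast, 2/3 for lung, and 1/1 for colon, so every class contributes equally to the average regardless of how many samples it has.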

