The presence or absence alone of miRNA isoforms (isomiRs) successfully discriminate amongst the 32 TCGA cancer types

2016 ◽  
Author(s):  
Aristeidis G. Telonis ◽  
Rogan Magee ◽  
Phillipe Loher ◽  
Inna Chervoneva ◽  
Eric Londin ◽  
...  

Previously, we demonstrated that miRNA isoforms (isomiRs) are constitutive and their expression profiles depend on tissue, tissue state, and disease subtype. We have now extended our isomiR studies to The Cancer Genome Atlas (TCGA) repository. Specifically, we studied whether isomiR profiles can distinguish amongst the 32 cancers. We analyzed 10,271 datasets from 32 cancers and found 7,466 isomiRs from 807 miRNA hairpin-arms to be expressed above threshold. Using the top 20% most abundant isomiRs, we built a classifier that relied on “binary” isomiR profiles: isomiRs were simply represented as ‘present’ or ‘absent’ and, unlike previous methods, all knowledge about their expression levels was ignored. The classifier could label tumor samples with an average sensitivity of 93% and a False Discovery Rate of 3%. Notably, its ability to classify well persisted even when we reduced the set of used features (i.e., isomiRs) by a factor of 10. A counterintuitive finding of our analysis is that the isomiRs and miRNA loci with the highest ability to classify tumors are not the ones that have been attracting the most research attention in the miRNA field. Our results provide a framework in which to study cancer-type-specific isomiRs and explore their potential uses as cancer biomarkers.
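The "binary" profiles and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' classifier: the abstract does not state the threshold value or the learning algorithm, so the `threshold` argument, the function names, and the toy labels below are all hypothetical.

```python
def binarize(profile, threshold):
    """Turn expression values into presence (1) / absence (0) flags.

    `threshold` is a hypothetical cutoff; the abstract only says
    "expressed above threshold". Expression levels are discarded.
    """
    return [1 if value > threshold else 0 for value in profile]


def sensitivity_and_fdr(true_labels, predicted_labels, cancer_type):
    """Per-type sensitivity TP / (TP + FN) and FDR FP / (TP + FP)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == cancer_type and p == cancer_type for t, p in pairs)
    fn = sum(t == cancer_type and p != cancer_type for t, p in pairs)
    fp = sum(t != cancer_type and p == cancer_type for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    fdr = fp / (tp + fp) if tp + fp else 0.0
    return sensitivity, fdr


# Toy data (made up for illustration, not taken from TCGA).
binary_profile = binarize([0.0, 5.2, 12.0], threshold=1.0)
truth = ["BRCA", "BRCA", "LUAD", "LUAD"]
calls = ["BRCA", "LUAD", "LUAD", "LUAD"]
luad_sensitivity, luad_fdr = sensitivity_and_fdr(truth, calls, "LUAD")
```

Averaging the per-type values over all cancer types would give summary figures of the kind quoted in the abstract, assuming that is how the averages were formed.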

mSystems ◽  
2018 ◽  
Vol 3 (5) ◽  
Author(s):  
Sara R. Selitsky ◽  
David Marron ◽  
Lisle E. Mose ◽  
Joel S. Parker ◽  
Dirk P. Dittmer

ABSTRACT Epstein-Barr virus (EBV) is convincingly associated with gastric cancer, nasopharyngeal carcinoma, and certain lymphomas, but its role in other cancer types remains controversial. To test the hypothesis that there are additional cancer types with high prevalence of EBV, we determined EBV viral expression in all the Cancer Genome Atlas Project (TCGA) mRNA sequencing (mRNA-seq) samples (n = 10,396) from 32 different tumor types. We found that EBV was present in gastric adenocarcinoma and lymphoma, as expected, and was also present in >5% of samples in 10 additional tumor types. For most samples, EBV transcript levels were low, which suggests that EBV was likely present due to infected infiltrating B cells. In order to determine if there was a difference in the B-cell populations, we assembled B-cell receptors for each sample and found B-cell receptor abundance (P ≤ 1.4 × 10^−20) and diversity (P ≤ 8.3 × 10^−27) were significantly higher in EBV-positive samples. Moreover, diversity was independent of B-cell abundance, suggesting that the presence of EBV was associated with an increased and altered B-cell population. IMPORTANCE Around 20% of human cancers are associated with viruses. Epstein-Barr virus (EBV) contributes to gastric cancer, nasopharyngeal carcinoma, and certain lymphomas, but its role in other cancer types remains controversial. We assessed the prevalence of EBV in RNA-seq from 32 tumor types in the Cancer Genome Atlas Project (TCGA) and found EBV to be present in >5% of samples in 12 tumor types. EBV infects epithelial cells and B cells and in B cells causes proliferation. We hypothesized that the low expression of EBV in most of the tumor types was due to infiltration of B cells into the tumor. The increase in B-cell abundance and diversity in subjects where EBV was detected in the tumors strengthens this hypothesis. Overall, we found that EBV was associated with an increased and altered immune response. 
This result is not evidence of causality, but a potential novel biomarker for tumor immune status.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Bi Lin ◽  
Yangyang Pan ◽  
Dinglai Yu ◽  
Shengjie Dai ◽  
Hongwei Sun ◽  
...  

Background. Pancreatic cancer is one of the most malignant tumors of the digestive system, and its treatment has barely progressed over the last two decades. Studies on m6A regulators for the past few years have seemingly provided a novel approach for malignant tumor therapy. m6A-related factors may be potential biomarkers and therapeutic targets. This research is focused on the gene characteristics and clinical values of m6A regulators in predicting prognosis in pancreatic cancer. Methods. In our study, we obtained gene expression profiles with copy number variation (CNV) data and clinical characteristic data of 186 patients with pancreatic cancer from The Cancer Genome Atlas (TCGA) portal. Then, we determined the alteration of m6A regulators and their correlation with clinicopathological features using the log-rank test, Cox regression model, and chi-square test. Additionally, we validated the prognostic value of m6A regulators in the International Cancer Genome Consortium (ICGC). Results. The results suggested that pancreatic cancer patients with ALKBH5 CNV were associated with worse overall survival and disease-free survival than those with diploid genes. Additionally, upregulation of the writer gene ALKBH5 had a positive correlation with the activation of AKT pathways in the TCGA database. Conclusion. Our study not only demonstrated genetic characteristic changes of m6A-related genes in pancreatic cancer and found a strong relationship between the changes of ALKBH5 and poor prognosis but also provided a novel therapeutic target for pancreatic cancer therapy.


2019 ◽  
Author(s):  
Lin Li ◽  
Mengyuan Li ◽  
Xiaosheng Wang

Abstract Many studies have shown that TP53 mutations play a negative role in antitumor immunity. However, a few studies reported that TP53 mutations could promote antitumor immunity. To explain these contradictory findings, we analyzed five cancer cohorts from The Cancer Genome Atlas (TCGA) project. We found that TP53-mutated cancers had significantly higher levels of antitumor immune signatures than TP53-wildtype cancers in breast invasive carcinoma (BRCA) and lung adenocarcinoma (LUAD). In contrast, TP53-mutated cancers had significantly lower antitumor immune signature levels than TP53-wildtype cancers in stomach adenocarcinoma (STAD), colon adenocarcinoma (COAD), and head and neck squamous cell carcinoma (HNSC). Moreover, TP53-mutated cancers likely had higher tumor mutation burden (TMB) and tumor aneuploidy level (TAL) than TP53-wildtype cancers. However, the TMB differences were more marked between TP53-mutated and TP53-wildtype cancers than the TAL differences in BRCA and LUAD, and the TAL differences were more significant in STAD and COAD. Furthermore, we showed that TMB and TAL had a positive and a negative correlation with antitumor immunity, respectively, and that TMB affected antitumor immunity more strongly than TAL did in BRCA and LUAD while TAL affected antitumor immunity more strongly than TMB in STAD and HNSC. These findings indicate that the distinct correlations between TP53 mutations and antitumor immunity in different cancer types are a consequence of the joint effect of the altered TMB and TAL caused by TP53 mutations on tumor immunity. Our data suggest that the TP53 mutation status could be a useful biomarker for cancer immunotherapy response depending on cancer types.


2017 ◽  
Author(s):  
Zhuyi Xue ◽  
René L Warren ◽  
Ewan A Gibb ◽  
Daniel MacMillan ◽  
Johnathan Wong ◽  
...  

Abstract Alternative polyadenylation (APA) of 3’ untranslated regions (3’ UTRs) has been implicated in cancer development. Earlier reports on APA in cancer primarily focused on 3’ UTR length modifications, and the conventional wisdom is that tumor cells preferentially express transcripts with shorter 3’ UTRs. Here, we analyzed the APA patterns of 114 genes, a select list of oncogenes and tumor suppressors, in 9,939 tumor and 729 normal tissue samples across 33 cancer types using RNA-Seq data from The Cancer Genome Atlas, and we found that the APA regulation machinery is much more complicated than what was previously thought. We report 77 cases (gene-cancer type pairs) of differential 3’ UTR cleavage patterns between normal and tumor tissues, involving 33 genes in 13 cancer types. For 15 genes, the tumor-specific cleavage patterns are recurrent across multiple cancer types. While the cleavage patterns in certain genes indicate apparent trends of 3’ UTR shortening in tumor samples, over half of the 77 cases imply 3’ UTR length change trends in cancer that are more complex than simple shortening or lengthening. This work extends the current understanding of APA regulation in cancer, and demonstrates how large volumes of RNA-seq data generated for characterizing cancer cohorts can be mined to investigate this process.


Genes ◽  
2019 ◽  
Vol 10 (8) ◽  
pp. 604 ◽  
Author(s):  
Wang ◽  
Wu ◽  
Ma

Prognosis modeling plays an important role in cancer studies. With the development of omics profiling, extensive research has been conducted to search for prognostic markers for various cancer types. However, many of the existing studies share a common limitation by only focusing on a single cancer type and suffering from a lack of sufficient information. With potential molecular similarity across cancer types, one cancer type may contain information useful for the analysis of other types. The integration of multiple cancer types may facilitate information borrowing so as to more comprehensively and more accurately describe prognosis. In this study, we conduct marginal and joint integrative analysis of multiple cancer types, effectively introducing integration in the discovery process. For accommodating high dimensionality and identifying relevant markers, we adopt the advanced penalization technique which has a solid statistical ground. Gene expression data on nine cancer types from The Cancer Genome Atlas (TCGA) are analyzed, leading to biologically sensible findings that are different from the alternatives. Overall, this study provides a novel venue for cancer prognosis modeling by integrating multiple cancer types.


2014 ◽  
Vol 36 (4) ◽  
pp. E23 ◽  
Author(s):  
David D. Gonda ◽  
Vincent J. Cheung ◽  
Karra A. Muller ◽  
Amit Goyal ◽  
Bob S. Carter ◽  
...  

Differentiating between low-grade gliomas (LGGs) of astrocytic and oligodendroglial origin remains a major challenge in neurooncology. Here the authors analyzed The Cancer Genome Atlas (TCGA) profiles of LGGs with the goal of identifying distinct molecular characteristics that would afford accurate and reliable discrimination of astrocytic and oligodendroglial tumors. They found that 1) oligodendrogliomas are more likely to exhibit the glioma-CpG island methylator phenotype (G-CIMP), relative to low-grade astrocytomas; 2) relative to oligodendrogliomas, low-grade astrocytomas exhibit a higher expression of genes related to mitosis, replication, and inflammation; and 3) low-grade astrocytic tumors harbor microRNA profiles similar to those previously described for glioblastoma tumors. Orthogonal intersection of these molecular characteristics with existing molecular markers, such as IDH1 mutation, TP53 mutation, and 1p19q status, should facilitate accurate and reliable pathological diagnosis of LGGs.


2014 ◽  
Vol 13s2 ◽  
pp. CIN.S13776
Author(s):  
Yanxun Xu ◽  
Yitan Zhu ◽  
Peter Müller ◽  
Riten Mitra ◽  
Yuan Ji

The Cancer Genome Atlas (TCGA) generates comprehensive genomic data for thousands of patients over more than 20 cancer types. TCGA data are typically whole-genome measurements of multiple genomic features, such as DNA copy numbers, DNA methylation, and gene expression, providing unique opportunities for investigating cancer mechanism from multiple molecular and regulatory layers. We propose a Bayesian graphical model to systemically integrate multi-platform TCGA data for inference of the interactions between different genomic features either within a gene or between multiple genes. The presence or absence of edges in the graph indicates the presence or absence of conditional dependence between genomic features. The inference is restricted to genes within a known biological network, but can be extended to any sets of genes. Applying the model to the same genes using patient samples in two different cancer types, we identify network components that are common as well as different between cancer types. The examples and codes are available at https://www.ma.utexas.edu/users/yxu/software.html.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Milad Mostavi ◽  
Yu-Chiao Chiu ◽  
Yidong Chen ◽  
Yufei Huang

Abstract Background The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during the training where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen during the training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. Results We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from the Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was utilized to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer related functions between primary and metastatic tumors. Conclusion This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognostic, and our understanding of cancer.
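As a rough illustration of the 1-NN benchmark mentioned above, an N-way one-shot prediction can be sketched as follows: the query profile is assigned the cancer type of the closest of N support profiles, one per type. The Euclidean distance, the function name `one_shot_1nn`, and the toy profiles are assumptions; the abstract does not specify the metric, and this is not the CancerSiamese model itself.

```python
from math import dist


def one_shot_1nn(query, support):
    """Return the label of the support profile nearest to `query`.

    `support` maps each of the N candidate cancer types to a single
    expression profile (one shot per type). Euclidean distance is an
    assumed choice of metric.
    """
    return min(support, key=lambda label: dist(query, support[label]))


# Toy 3-way example with made-up two-gene profiles.
support_set = {
    "BRCA": [1.0, 0.0],
    "LUAD": [0.0, 1.0],
    "COAD": [5.0, 5.0],
}
predicted_type = one_shot_1nn([0.9, 0.2], support_set)
```

A Siamese model replaces the fixed distance with a learned similarity over learned representations, which is the part this sketch leaves out.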


2020 ◽  
Author(s):  
Qian Ke ◽  
Wikum Dinalankara ◽  
Laurent Younes ◽  
Donald Geman ◽  
Luigi Marchionni

Abstract Cancer cells display massive dysregulation of key regulatory pathways due to now well-catalogued mutations and other DNA-related aberrations. Moreover, enormous heterogeneity has been commonly observed in the identity, frequency and location of these aberrations across individuals with the same cancer type or subtype, and this variation naturally propagates to the transcriptome, resulting in myriad types of dysregulated gene expression programs. Many have argued that a more integrative and quantitative analysis of heterogeneity of DNA and RNA molecular profiles may be necessary for designing more systematic explorations of alternative therapies and improving predictive accuracy. We introduce a representation of multi-omics profiles which is sufficiently rich to account for observed heterogeneity and support the construction of quantitative, integrated metrics of variation. Starting from the network of interactions existing in Reactome, we build a library of “paired DNA-RNA aberrations” that represent prototypical and recurrent patterns of dysregulation in cancer; each two-gene “Source-Target Pair” (STP) consists of a “source” regulatory gene and a “target” gene whose expression is plausibly “controlled” by the source gene. The STP is then “aberrant” in a joint DNA-RNA profile if the source gene is DNA-aberrant (e.g., mutated, deleted, or duplicated), and the downstream target gene is “RNA-aberrant”, meaning its expression level is outside the normal, baseline range. With M STPs, each sample profile has exactly one of the 2^M possible configurations. We concentrate on subsets of STPs, and the corresponding reduced configurations, by selecting tissue-dependent minimal coverings, defined as the smallest family of STPs with the property that every sample in the considered population displays at least one aberrant STP within that family. These minimal coverings can be computed with integer programming. 
Given such a covering, a natural measure of cross-sample diversity is the extent to which the particular aberrant STPs composing a covering vary from sample to sample; this variability is captured by the entropy of the distribution over configurations. We apply this program to data from TCGA for six distinct tumor types (breast, prostate, lung, colon, liver, and kidney cancer). This enables an efficient simplification of the complex landscape observed in cancer populations, resulting in the identification of novel signatures of molecular alterations which are not detected with frequency-based criteria. Estimates of cancer heterogeneity across tumor phenotypes reveal a stable pattern: entropy increases with disease severity. This framework is then well-suited to accommodate the expanding complexity of cancer genomes and epigenomes emerging from large consortia projects. Author Summary A large variety of genomic and transcriptomic aberrations are observed in cancer cells, and their identity, location, and frequency can be highly indicative of the particular subtype or molecular phenotype, and thereby inform treatment options. However, elucidating this association between sets of aberrations and subtypes of cancer is severely impeded by considerable diversity in the set of aberrations across samples from the same population. Most attempts at analyzing tumor heterogeneity have dealt with either the genome or transcriptome in isolation. Here we present a novel, multi-omics approach for quantifying heterogeneity by determining a small set of paired DNA-RNA aberrations that incorporates potential downstream effects on gene expression. We apply integer programming to identify a small set of paired aberrations such that at least one among them is present in every sample of a given cancer population. The resulting “coverings” are analyzed for six cancer cohorts from the Cancer Genome Atlas, and facilitate introducing an information-theoretic measure of heterogeneity. 
Our results identify many known facets of tumorigenesis as well as suggest potential novel genes and interactions of interest. Data Availability Statement RNA-Seq data, somatic mutation data and copy number data for The Cancer Genome Atlas were obtained through the Xena Cancer Genome Browser database (https://xenabrowser.net) from individual cancer type cohorts. Processed data in the form of TAB delimited files, and selected tissue-level coverings (in Excel format) are provided as additional supplementary material and are also available from the Marchionni laboratory website (www.marchionnilab.org/signatures.html).
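The covering and entropy ideas in this abstract can be made concrete with a small Python sketch. Note the swap: the authors compute minimal coverings with integer programming, whereas the sketch below uses an exhaustive search over families of increasing size, which is only practical for toy inputs. The function names, the STP labels, the sample data, and the choice of base-2 logarithms are all assumptions for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def minimal_covering(aberrant):
    """Smallest family of STPs hitting every sample at least once.

    `aberrant` is a list with one set per sample, holding the STPs that
    are aberrant in that sample. Brute force stands in for the integer
    program used in the paper. Returns None if no covering exists.
    """
    stps = sorted(set().union(*aberrant))
    for size in range(1, len(stps) + 1):
        for family in combinations(stps, size):
            chosen = set(family)
            if all(sample & chosen for sample in aberrant):
                return chosen
    return None


def configuration_entropy(aberrant, covering):
    """Shannon entropy (bits) of the configurations over the covering."""
    ordered = sorted(covering)
    counts = Counter(
        tuple(stp in sample for stp in ordered) for sample in aberrant
    )
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Toy population of four samples (made up, not TCGA data).
samples = [{"A->B"}, {"A->B", "C->D"}, {"C->D"}, {"A->B", "E->F"}]
covering = minimal_covering(samples)
entropy_bits = configuration_entropy(samples, covering)
```

In this toy case no single STP is aberrant in every sample, so the covering needs two STPs, and the three distinct configurations observed over that covering give an entropy below the 2-bit maximum for M = 2.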

