Characterizing Cancer-Specific Networks by Integrating TCGA Data

2014 ◽  
Vol 13s2 ◽  
pp. CIN.S13776
Author(s):  
Yanxun Xu ◽  
Yitan Zhu ◽  
Peter Müller ◽  
Riten Mitra ◽  
Yuan Ji

The Cancer Genome Atlas (TCGA) generates comprehensive genomic data for thousands of patients across more than 20 cancer types. TCGA data are typically whole-genome measurements of multiple genomic features, such as DNA copy numbers, DNA methylation, and gene expression, providing unique opportunities for investigating cancer mechanisms from multiple molecular and regulatory layers. We propose a Bayesian graphical model to systematically integrate multi-platform TCGA data for inference of the interactions between different genomic features either within a gene or between multiple genes. The presence or absence of edges in the graph indicates the presence or absence of conditional dependence between genomic features. The inference is restricted to genes within a known biological network, but can be extended to any set of genes. Applying the model to the same genes using patient samples in two different cancer types, we identify network components that are common as well as different between cancer types. The examples and code are available at https://www.ma.utexas.edu/users/yxu/software.html.

mSystems ◽  
2018 ◽  
Vol 3 (5) ◽  
Author(s):  
Sara R. Selitsky ◽  
David Marron ◽  
Lisle E. Mose ◽  
Joel S. Parker ◽  
Dirk P. Dittmer

ABSTRACT Epstein-Barr virus (EBV) is convincingly associated with gastric cancer, nasopharyngeal carcinoma, and certain lymphomas, but its role in other cancer types remains controversial. To test the hypothesis that there are additional cancer types with high prevalence of EBV, we determined EBV viral expression in all the Cancer Genome Atlas Project (TCGA) mRNA sequencing (mRNA-seq) samples (n = 10,396) from 32 different tumor types. We found that EBV was present in gastric adenocarcinoma and lymphoma, as expected, and was also present in >5% of samples in 10 additional tumor types. For most samples, EBV transcript levels were low, which suggests that EBV was likely present due to infected infiltrating B cells. In order to determine if there was a difference in the B-cell populations, we assembled B-cell receptors for each sample and found B-cell receptor abundance (P ≤ 1.4 × 10⁻²⁰) and diversity (P ≤ 8.3 × 10⁻²⁷) were significantly higher in EBV-positive samples. Moreover, diversity was independent of B-cell abundance, suggesting that the presence of EBV was associated with an increased and altered B-cell population. IMPORTANCE Around 20% of human cancers are associated with viruses. Epstein-Barr virus (EBV) contributes to gastric cancer, nasopharyngeal carcinoma, and certain lymphomas, but its role in other cancer types remains controversial. We assessed the prevalence of EBV in RNA-seq from 32 tumor types in the Cancer Genome Atlas Project (TCGA) and found EBV to be present in >5% of samples in 12 tumor types. EBV infects epithelial cells and B cells and in B cells causes proliferation. We hypothesized that the low expression of EBV in most of the tumor types was due to infiltration of B cells into the tumor. The increase in B-cell abundance and diversity in subjects where EBV was detected in the tumors strengthens this hypothesis. Overall, we found that EBV was associated with an increased and altered immune response. 
This result is not evidence of causality, but a potential novel biomarker for tumor immune status.


2017 ◽  
pp. 1-13 ◽  
Author(s):  
Anshuman Panda ◽  
Anil Betigeri ◽  
Kalyanasundaram Subramanian ◽  
Jeffrey S. Ross ◽  
Dean C. Pavlick ◽  
...  

Purpose An association between mutational burden and response to immune checkpoint therapy has been documented in several cancer types. The potential for such a mutational burden threshold to predict response to immune checkpoint therapy was evaluated in several clinical datasets, where mutational burden was measured either by whole-exome sequencing or by using commercially available sequencing panels. Methods Whole-exome sequencing and RNA sequencing data of 33 solid cancer types from The Cancer Genome Atlas were analyzed to determine whether a robust immune checkpoint–activating mutation (iCAM) burden threshold associated with evidence of immune checkpoint activation exists in these cancers that may serve as a biomarker of response to immune checkpoint blockade therapy. Results We found that a robust iCAM threshold, associated with signatures of immune checkpoint activation, exists in eight of 33 solid cancers: melanoma, lung adenocarcinoma, colon adenocarcinoma, endometrial cancer, stomach adenocarcinoma, cervical cancer, estrogen receptor–positive/human epidermal growth factor receptor 2–negative breast cancer, and bladder-urothelial cancer. Tumors with a mutational burden higher than the threshold (iCAM positive) also had clear histologic evidence of lymphocytic infiltration. In published datasets of melanoma, lung adenocarcinoma, and colon cancer, patients with iCAM-positive tumors had significantly better response to immune checkpoint therapy compared with those with iCAM-negative tumors. Receiver operating characteristic analysis using The Cancer Genome Atlas predictions as the gold standard showed that iCAM-positive tumors are accurately identifiable using clinical sequencing assays, such as FoundationOne (Foundation Medicine, Cambridge, MA) or StrandAdvantage (Strand Life Sciences, Bangalore, India). 
Using the FoundationOne-derived threshold, an analysis of 113 melanoma tumors showed that patients with iCAM-positive disease have significantly better response to immune checkpoint therapy. iCAM-positive and iCAM-negative tumors have distinct mutation patterns and different immune microenvironments. Conclusion In eight solid cancers, a mutational burden threshold exists that may predict response to immune checkpoint blockade. This threshold is identifiable using available clinical sequencing assays.
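
The thresholding idea in this abstract can be illustrated with a small sketch. It assumes a tumor is called iCAM positive when its mutational burden exceeds a fixed cutoff, and it scores those calls against reference labels by sensitivity and specificity, the quantities that one point on a receiver operating characteristic curve summarizes. The function names, burden values, and the cutoff of 10 are hypothetical and are not taken from the study.

```python
def call_icam_positive(burden, threshold):
    """Return True when the mutational burden exceeds the threshold (assumed rule)."""
    return burden > threshold


def sensitivity_specificity(burdens, reference, threshold):
    """Score threshold-based calls against reference labels (True = positive)."""
    tp = fp = tn = fn = 0
    for burden, is_positive in zip(burdens, reference):
        predicted = call_icam_positive(burden, threshold)
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference labels must contain both classes")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical burdens (mutation counts) and reference labels for four tumors.
burdens = [2, 5, 12, 20]
reference = [False, False, True, True]
sens, spec = sensitivity_specificity(burdens, reference, threshold=10)
```

Sweeping the threshold over the observed burdens and recording each (sensitivity, specificity) pair would trace out the full curve.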


2016 ◽  
Author(s):  
Alexandra R. Buckley ◽  
Kristopher A. Standish ◽  
Kunal Bhutani ◽  
Trey Ideker ◽  
Hannah Carter ◽  
...  

Abstract The degree to which germline variation drives cancer development and shapes tumor phenotypes remains largely unexplored, possibly due to a lack of large-scale, publicly available germline data for a cancer cohort. Here we called germline variants on 9,618 cases from The Cancer Genome Atlas (TCGA) database representing 31 cancer types. We identified batch effects affecting loss of function (LOF) variant calls that can be traced back to differences in the way the sequence data were generated both within and across cancer types. Overall, LOF indel calls were more sensitive to technical artifacts than LOF Single Nucleotide Variant (SNV) calls. In particular, whole genome amplification of DNA prior to sequencing led to an artificially increased burden of LOF indel calls, which confounded association analyses relating germline variants to tumor type despite stringent indel filtering strategies. Due to the inherent noise we chose to remove all 614 amplified DNA samples, including all acute myeloid leukemia and virtually all ovarian cancer samples, from the final dataset. This study demonstrates how insufficient quality control can lead to false positive germline-tumor type associations and draws attention to the need to be sensitive to problems associated with a lack of uniformity in data generation in TCGA data. Author Summary Cancer research to date has largely focused on genetic aberrations specific to tumor tissue. In contrast, the degree to which germline, or inherited, variation contributes to tumorigenesis remains unclear, possibly due to a lack of accessible germline variant data. In this study we identify germline variants in 9,618 samples using raw germline exome data from The Cancer Genome Atlas (TCGA). There are substantial differences in the way exome sequence data were generated both across and within cancer types in TCGA. 
We observe that differences in sequence data generation introduced batch effects, that is, variation due to technical factors rather than true biological variation, in our variant data. Most notably, we observe that amplification of DNA prior to sequencing resulted in an excess of predicted damaging indel variants. We show how these batch effects can confound germline association analyses if not properly addressed. Our study highlights the difficulties of working with large public genomic datasets like TCGA, where samples are collected over time and across data centers, and particularly cautions against the use of amplified DNA samples for genetic association analyses.


2018 ◽  
Author(s):  
Jake Lever ◽  
Eric Y. Zhao ◽  
Jasleen Grewal ◽  
Martin R. Jones ◽  
Steven J. M. Jones

Abstract Understanding a mutation in cancer requires knowledge of the different roles that genes play in cancer as drivers, oncogenes and tumor suppressors. We present CancerMine, a high-quality text-mined knowledgebase that catalogues over 856 genes as drivers, 2,421 as oncogenes and 2,037 as tumor suppressors in 426 cancer types. We compile 3,485 genes that are not in the IntOGen resource of drivers and complement the Cancer Gene Census with 3,136 new genes identified as oncogenes and tumor suppressors. CancerMine provides a method for gene-centric clustering of cancer types, illustrating genetic similarities between cancer types of different organs, and was validated against data from the Cancer Genome Atlas (TCGA) project. Finally, with 178 novel cancer gene mentions in publications each month, this resource will be updated monthly, pre-empting the need to manually curate the ever-increasing number of novel cancer-associated genes. CancerMine is viewable through a web portal (http://bionlp.bcgsc.ca/cancermine/) and available for download (https://github.com/jakelever/cancermine).


2017 ◽  
Author(s):  
Xin Hu ◽  
Qianghu Wang ◽  
Floris Barthel ◽  
Ming Tang ◽  
Samirkumar Amin ◽  
...  

Fusion genes, particularly those involving kinases, have been demonstrated as drivers and are frequent therapeutic targets in cancer [1]. Here, we describe our results on detecting transcript fusions across 33 cancer types from The Cancer Genome Atlas (TCGA), totaling 9,966 cancer samples and 648 normal samples [2]. Preprocessing, including read alignment to both genome and transcriptome, and fusion detection were carried out using a uniform pipeline [3]. To validate the resultant fusions, we also called somatic structural variations for 561 cancers from whole genome sequencing data. A summary of the data used in this study is provided in Table S1. Our results can be accessed through our portal at http://www.tumorfusions.org.


2019 ◽  
pp. 1-9 ◽  
Author(s):  
Yitan Zhu ◽  
Abdallah S.R. Mohamed ◽  
Stephen Y. Lai ◽  
Shengjie Yang ◽  
Aasheesh Kanwar ◽  
...  

Purpose Recent data suggest that imaging radiomic features of a tumor could be indicative of important genomic biomarkers. Understanding the relationship between radiomic and genomic features is important for basic cancer research and future patient care. We performed a comprehensive study to discover the imaging-genomic associations in head and neck squamous cell carcinoma (HNSCC) and explore the potential of predicting tumor genomic alterations using radiomic features. Methods Our retrospective study integrated whole-genome multiomics data from The Cancer Genome Atlas with matched computed tomography imaging data from The Cancer Imaging Archive for the same set of 126 patients with HNSCC. Linear regression and gene set enrichment analysis were used to identify statistically significant associations between radiomic imaging and genomic features. A random forest classifier was used to predict the status of two key HNSCC molecular biomarkers, human papillomavirus and disruptive TP53 mutation, on the basis of radiomic features. Results Widespread and statistically significant associations were discovered between genomic features (including microRNA expression, somatic mutations, and transcriptional activity, copy number variations, and promoter region DNA methylation changes of pathways) and radiomic features characterizing the size, shape, and texture of the tumor. Prediction of human papillomavirus and TP53 mutation status using radiomic features achieved areas under the receiver operating characteristic curve of 0.71 and 0.641, respectively. Conclusion Our exploratory study suggests that radiomic features are associated with genomic characteristics at multiple molecular layers in HNSCC and provides justification for continued development of radiomics as biomarkers for relevant genomic alterations in HNSCC.
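
For readers unfamiliar with the metric reported here, the area under the receiver operating characteristic curve can be computed directly from classifier scores as the fraction of (positive, negative) pairs that are ranked correctly, with ties counted as half. The sketch below is a generic illustration under that definition. It is not the authors' code, and the scores and labels are made up.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and true biomarker status.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [True, False, True, False]
auc = roc_auc(scores, labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.71 and 0.641 in context.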


2021 ◽  
Vol 11 ◽  
Author(s):  
Yi-Hong Liu ◽  
Yu-Lian Chen ◽  
Ting-Yu Lai ◽  
Ying-Chieh Ko ◽  
Yu-Fu Chou ◽  
...  

Background Partial epithelial-mesenchymal transition (p-EMT) is a distinct clinicopathological feature prevalent in oral cavity tumors of The Cancer Genome Atlas. Located at the invasion front, p-EMT cells require additional support from the tumor stroma for collective cell migration, including track clearing, extracellular matrix remodeling and immune evasion. The pathological roles of otherwise nonmalignant cancer-associated fibroblasts (CAFs) in cancer progression are emerging. Methods Gene set enrichment analysis was used to reveal differentially enriched genes and molecular pathways in OC3 and TW2.6 xenograft tissues, representing mesenchymal and p-EMT tumors, respectively. R packages for genomic data science were used for statistical evaluations and data visualization. Immunohistochemistry and Alcian blue staining were conducted to validate the bioinformatic results. Univariate and multivariate Cox proportional hazards models were fitted to identify covariates significantly associated with overall survival in clinical datasets. Kaplan–Meier curves of estimated overall survival were compared for statistical difference using the log-rank test. Results Compared to mesenchymal OC3 cells, tumor stroma derived from p-EMT TW2.6 cells was significantly enriched in microvessel density, tumor-excluded macrophages, inflammatory CAFs, and extracellular hyaluronan deposition. By translating these results to clinical transcriptomic datasets of oral cancer specimens, including the Puram single-cell RNA-seq cohort comprising ~6000 cells, we identified the expression of stromal TGFBI and HYAL1 as independent poor and protective biomarkers, respectively, for 40 Taiwanese oral cancer tissues that were all derived from betel quid users. 
In The Cancer Genome Atlas, TGFBI was a poor marker not only for head and neck cancer but also for six additional cancer types, and HYAL1 was a good indicator for four tumor cohorts, suggesting common stromal effects existing in different cancer types. Conclusions As the tumor stroma coevolves with cancer progression, the cellular origins of molecular markers identified from conventional whole tissue mRNA-based analyses should be cautiously interpreted. By incorporating disease-matched xenograft tissue and single-cell RNA-seq results, we suggest that TGFBI and HYAL1, primarily expressed by stromal CAFs and endothelial cells, respectively, could serve as robust prognostic biomarkers for oral cancer control.


2021 ◽  
Vol 11 ◽  
Author(s):  
Luuk Harbers ◽  
Federico Agostini ◽  
Marcin Nicos ◽  
Dimitri Poddighe ◽  
Magda Bienko ◽  
...  

Somatic copy number alterations (SCNAs) are a pervasive trait of human cancers that contributes to tumorigenesis by affecting the dosage of multiple genes at the same time. In the past decade, The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) initiatives have generated and made publicly available SCNA genomic profiles from thousands of tumor samples across multiple cancer types. Here, we present a comprehensive analysis of 853,218 SCNAs across 10,729 tumor samples belonging to 32 cancer types using TCGA data. We then discuss current models for how SCNAs likely arise during carcinogenesis and how genomic SCNA profiles can inform clinical practice. Lastly, we highlight open questions in the field of cancer-associated SCNAs.


2016 ◽  
Author(s):  
Aristeidis G. Telonis ◽  
Rogan Magee ◽  
Phillipe Loher ◽  
Inna Chervoneva ◽  
Eric Londin ◽  
...  

Previously, we demonstrated that miRNA isoforms (isomiRs) are constitutive and their expression profiles depend on tissue, tissue state, and disease subtype. We have now extended our isomiR studies to The Cancer Genome Atlas (TCGA) repository. Specifically, we studied whether isomiR profiles can distinguish amongst the 32 cancers. We analyzed 10,271 datasets from 32 cancers and found 7,466 isomiRs from 807 miRNA hairpin-arms to be expressed above threshold. Using the top 20% most abundant isomiRs, we built a classifier that relied on “binary” isomiR profiles: isomiRs were simply represented as ‘present’ or ‘absent’ and, unlike previous methods, all knowledge about their expression levels was ignored. The classifier could label tumor samples with an average sensitivity of 93% and a False Discovery Rate of 3%. Notably, its ability to classify well persisted even when we reduced the set of used features (=isomiRs) by a factor of 10. A counterintuitive finding of our analysis is that the isomiRs and miRNA loci with the highest ability to classify tumors are not the ones that have been attracting the most research attention in the miRNA field. Our results provide a framework in which to study cancer-type-specific isomiRs and explore their potential uses as cancer biomarkers.
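
The "binary" profile idea can be sketched in a few lines. The abstract does not say which classification algorithm was used, so the example below substitutes a simple nearest-neighbor rule on Jaccard similarity as a stand-in. The isomiR names, expression values, cancer labels, and threshold are all hypothetical.

```python
def binarize(expression, threshold):
    """Reduce an expression profile to the set of isomiRs that are 'present'."""
    return {name for name, level in expression.items() if level > threshold}


def jaccard(a, b):
    """Jaccard similarity between two presence sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(sample, labeled_profiles):
    """Return the label of the most similar labeled binary profile (stand-in rule)."""
    best_label, _ = max(labeled_profiles, key=lambda item: jaccard(sample, item[1]))
    return best_label


# Hypothetical reference profiles: (cancer label, set of present isomiRs).
reference = [
    ("BRCA", {"iso-a", "iso-b"}),
    ("LUAD", {"iso-c", "iso-d"}),
]
# Expression levels are discarded once presence or absence is decided.
sample = binarize({"iso-a": 12.0, "iso-b": 0.5, "iso-c": 0.0}, threshold=1.0)
label = classify(sample, reference)
```

The point of the sketch is that after binarization only set membership remains, so any classifier applied afterwards cannot use expression magnitudes.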


2019 ◽  
pp. 1-11 ◽  
Author(s):  
Evan S. Smith ◽  
Arnaud Da Cruz Paula ◽  
Karen A. Cadoo ◽  
Nadeem R. Abu-Rustum ◽  
Xin Pei ◽  
...  

PURPOSE Endometrial cancer (EC) is not considered a component of the hereditary breast and ovarian cancer syndrome but can arise in patients with germline BRCA1/2 (gBRCA1/2) mutations. Biallelic BRCA1/2 alterations are associated with genomic features of homologous recombination DNA repair deficiency (HRD) in cancer. We sought to determine if ECs in gBRCA1/2 mutation carriers harbor biallelic alterations and/or features of HRD. METHODS Of 769 patients with EC who underwent germline panel testing, 10 pathogenic gBRCA1/2 mutation carriers were identified, and their tumor- and normal-derived DNA was subjected to massively parallel sequencing targeting at least 410 cancer-related genes. Three gBRCA1/2-associated ECs were identified in 232 ECs subjected to whole-exome sequencing by The Cancer Genome Atlas. Somatic mutations, copy number alterations, loss of heterozygosity, microsatellite instability (MSI), and genomic HRD features were assessed. RESULTS Of the 13 patients included who had EC, eight harbored pathogenic gBRCA1 mutations and five harbored gBRCA2 mutations. Eight (100%) and two (40%) ECs harbored biallelic BRCA1 and BRCA2 alterations, respectively, through loss of heterozygosity of the wild-type allele. All ECs harbored somatic TP53 mutations. One monoallelic/sporadic gBRCA2-associated EC had MLH1 promoter methylation and was MSI high. High large-scale state transition scores, a genomic feature of HRD, were found only in ECs with bi- but not monoallelic BRCA1/2 alterations. The Signature Multivariate Analysis HRD signature Sig3 was enriched in biallelic gBRCA1/2 ECs, and the three ECs from The Cancer Genome Atlas with BRCA1 biallelic alterations subjected to whole-exome sequencing displayed a dominant HRD-related mutational signature 3. CONCLUSION A subset of gBRCA1/2-associated ECs harbor biallelic BRCA1/2 alterations and genomic features of HRD, which may benefit from homologous recombination–directed treatment regimens. 
ECs in BRCA2 mutation carriers might be sporadic and even MSI high, and may potentially benefit from immune-checkpoint inhibition.

