CloneSig can jointly infer intra-tumor heterogeneity and mutational signature activity in bulk tumor sequencing data

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Judith Abécassis ◽  
Fabien Reyal ◽  
Jean-Philippe Vert

Abstract Systematic DNA sequencing of cancer samples has highlighted the importance of two aspects of cancer genomics: intra-tumor heterogeneity (ITH) and mutational processes. These two aspects may not always be independent, as different mutational processes could be involved in different stages or regions of the tumor, but existing computational approaches to study them largely ignore this potential dependency. Here, we present CloneSig, a computational method to jointly infer ITH and mutational processes in a tumor from bulk-sequencing data. Extensive simulations show that CloneSig outperforms current methods for ITH inference and detection of mutational processes when the distribution of mutational signatures changes between clones. Applied to a large cohort of 8,951 tumors with whole-exome sequencing data from The Cancer Genome Atlas, and to a pan-cancer dataset of 2,632 whole-genome sequencing tumor samples from the Pan-Cancer Analysis of Whole Genomes initiative, CloneSig obtains results overall coherent with previous studies.

2019 ◽  
Author(s):  
Judith Abécassis ◽  
Fabien Reyal ◽  
Jean-Philippe Vert

The ability to sequence DNA in cancer samples has recently triggered much effort to identify the forces at the genomic level that shape tumorigenesis and cancer progression. This effort has brought new understanding or clarification of two important aspects of cancer genomics: (i) intra-tumor heterogeneity (ITH), as captured by the variability in observed prevalences of somatic mutations within a tumor, and (ii) mutational processes, as revealed by the distribution of the types of somatic mutation and their immediate nucleotide context. These two aspects are not independent of each other, as different mutational processes can be involved in different subclones, but current computational approaches to study them largely ignore this dependency. In particular, sequential methods that first estimate subclones and then analyze the mutational processes active in each clone can easily miss changes in mutational processes if the clonal decomposition step fails; conversely, information regarding mutational signatures is overlooked during the subclonal reconstruction. To address these limitations, we present CloneSig, a new computational method to jointly infer ITH and mutational processes in a tumor from bulk-sequencing data, including whole-exome sequencing (WES) data, by leveraging their dependency. We show through an extensive benchmark on simulated samples that CloneSig is always as good as or better than state-of-the-art methods for ITH inference and detection of mutational processes. We then apply CloneSig to a large cohort of 8,954 tumors with WES data from The Cancer Genome Atlas (TCGA), where we obtain results coherent with previous studies on whole-genome sequencing (WGS) data, as well as promising new findings. This validates the applicability of CloneSig to WES data, paving the way for its use in clinical settings, where WES is increasingly deployed.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Yu Kong ◽  
Christopher M. Rose ◽  
Ashley A. Cass ◽  
Alexander G. Williams ◽  
Martine Darwish ◽  
...  

Abstract Profound global loss of DNA methylation is a hallmark of many cancers. One potential consequence of this is the reactivation of transposable elements (TEs) which could stimulate the immune system via cell-intrinsic antiviral responses. Here, we develop REdiscoverTE, a computational method for quantifying genome-wide TE expression in RNA sequencing data. Using The Cancer Genome Atlas database, we observe increased expression of over 400 TE subfamilies, of which 262 appear to result from a proximal loss of DNA methylation. The most recurrent TEs are among the evolutionarily youngest in the genome, predominantly expressed from intergenic loci, and associated with antiviral or DNA damage responses. Treatment of glioblastoma cells with a demethylation agent results in both increased TE expression and de novo presentation of TE-derived peptides on MHC class I molecules. Therapeutic reactivation of tumor-specific TEs may synergize with immunotherapy by inducing inflammation and the display of potentially immunogenic neoantigens.


2018 ◽  
Vol 17 ◽  
pp. 117693511877478 ◽  
Author(s):  
Jovan Cejovic ◽  
Jelena Radenkovic ◽  
Vladimir Mladenovic ◽  
Adam Stanojevic ◽  
Milica Miletic ◽  
...  

Increased efforts in cancer genomics research and bioinformatics are producing tremendous amounts of data. These data are diverse in origin, format, and content. As the amount of available sequencing data increases, technologies that make them discoverable and usable are critically needed. In response, we have developed a Semantic Web–based Data Browser, a tool allowing users to visually build and execute ontology-driven queries. This approach simplifies access to available data and improves the process of using them in analyses on the Seven Bridges Cancer Genomics Cloud (CGC; www.cancergenomicscloud.org). The Data Browser makes large data sets easily explorable and simplifies the retrieval of specific data of interest. Although initially implemented on top of The Cancer Genome Atlas (TCGA) data set, the Data Browser’s architecture allows for seamless integration of other data sets. By deploying it on the CGC, we have enabled remote researchers to access data and perform collaborative investigations.


NAR Cancer ◽  
2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Chie Kikutake ◽  
Minako Yoshihara ◽  
Mikita Suyama

Abstract Cancer-related mutations have been mainly identified in protein-coding regions. Recent studies have demonstrated that mutations in non-coding regions of the genome could also be a risk factor for cancer. However, the non-coding regions comprise 98% of the total length of the human genome and contain a huge number of mutations, making it difficult to interpret their impact on the pathogenesis of cancer. To comprehensively identify cancer-related non-coding mutations, we focused on recurrent mutations in non-coding regions using somatic mutation data from COSMIC and whole-genome sequencing data from The Cancer Genome Atlas (TCGA). We identified 21 574 recurrent mutations in non-coding regions that were shared by at least two different samples from both COSMIC and TCGA databases. Among them, 580 candidate cancer-related non-coding recurrent mutations were identified based on epigenomic and chromatin structure datasets. One such mutation was located in an RREB1 binding site that is thought to interact with the TEAD1 promoter. Our results suggest that mutations may disrupt the binding of RREB1 to the candidate enhancer region and increase TEAD1 expression levels. Our findings demonstrate that non-coding recurrent mutations and coding mutations may contribute to the pathogenesis of cancer.


Open Medicine ◽  
2021 ◽  
Vol 16 (1) ◽  
pp. 459-463
Author(s):  
Arash Hooshmand

Abstract A new logistic regression-based method to distinguish between cancerous and noncancerous RNA genomic data is developed and tested with 100% precision on 595 healthy and cancerous prostate samples. A logistic regression system is developed and trained using whole-exome sequencing data at a high level, i.e., normalized quantification of RNAs obtained from 495 prostate cancer samples from The Cancer Genome Atlas and 100 healthy samples from the Genotype-Tissue Expression project. We show that both the sensitivity and the specificity of the method in the classification of cancerous and noncancerous cells are 100%.
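
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from binary labels. It is not the author's code: the function name and the toy labels are hypothetical and are not taken from the study's 595 samples.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = cancerous.

    Assumes both classes are present in y_true (otherwise a division by zero occurs).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancer called cancer
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancer called healthy
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy called healthy
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy called cancer
    sensitivity = tp / (tp + fn)  # fraction of true cancers that are detected
    specificity = tn / (tn + fp)  # fraction of healthy samples correctly cleared
    return sensitivity, specificity


# Hypothetical toy labels, for illustration only.
toy_truth = [1, 1, 1, 1, 0, 0]
toy_calls = [1, 1, 1, 0, 0, 1]
print(sensitivity_specificity(toy_truth, toy_calls))
```

Both values equal 1.0 only when there are no false negatives and no false positives, which is what a claim of 100% sensitivity and specificity amounts to.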


2019 ◽  
Author(s):  
William C. Wright ◽  
Taosheng Chen

Abstract Here we obtained RNA-sequencing data from the publicly available Pan-Cancer analysis project performed by The Cancer Genome Atlas (TCGA). Data within this project were processed with the same experimental protocol and analyzed downstream by the UCSC Toil recompute project. We reprocessed the resulting gene count files in batch to obtain normalized expression, a step critical for proper and comparable interpretation. We describe the linear modeling and normalization protocol, and provide an example of plotting the results using a gene of interest. We perform the entire protocol using freely available packages within the R framework.


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Lingyan Chen ◽  
Jianfeng Dong ◽  
Zeying Li ◽  
Yu Chen ◽  
Yan Zhang

Abstract Background It has been revealed that B7H4 is negatively correlated with PDL1 and identifies immuno-cold tumors in glioma. However, the application of the B7H4-PDL1 classifier in cancers has not been well validated. Methods A pan-cancer analysis was conducted to evaluate the immunological role of B7H4 using the RNA-sequencing data downloaded from The Cancer Genome Atlas (TCGA). Immunohistochemistry (IHC) and multiplexed quantitative immunofluorescence (QIF) were performed to validate the primary results revealed by bioinformatics analysis. Results The pan-cancer analysis revealed that B7H4 was negatively correlated with PDL1 expression and immune cell infiltration in CeCa. In addition, patients with high B7H4 exhibited the shortest overall survival (OS) and relapse-free survival (RFS) while those with high PDL1 exhibited a better prognosis. Multiplexed QIF showed that B7H4 was mutually exclusive with PDL1 expression and the B7H4-high group exhibited the lowest CD8+ T cell infiltration. In addition, B7H4-high predicted highly proliferative subtypes, which expressed the highest Ki67 antigen. Moreover, B7H4-high also indicated a lower response to multiple therapies. Conclusions Overall, the B7H4-PDL1 classifier identifies the immunogenicity and predicts proliferative subtypes and limited therapeutic options in CeCa, which may be a convenient and feasible biomarker in clinical practice.


2018 ◽  
Author(s):  
Yu Kong ◽  
Chris Rose ◽  
Ashley A. Cass ◽  
Martine Darwish ◽  
Steve Lianoglou ◽  
...  

Abstract Profound loss of DNA methylation is a well-recognized hallmark of cancer. Given its role in silencing transposable elements (TEs), we hypothesized that extensive TE expression occurs in tumors with highly demethylated DNA. We developed REdiscoverTE, a computational method for quantifying genome-wide TE expression in RNA sequencing data. Using The Cancer Genome Atlas database, we observed increased expression of over 400 TE subfamilies, of which 262 appeared to result from a proximal loss of DNA methylation. The most recurrent TEs were among the evolutionarily youngest in the genome, predominantly expressed from intergenic loci, and associated with antiviral or DNA damage responses. Treatment of glioblastoma cells with a demethylation agent resulted in both increased TE expression and de novo presentation of TE-derived peptides on MHC class I molecules. Therapeutic reactivation of tumor-specific TEs may synergize with immunotherapy by inducing both inflammation and the display of potentially immunogenic neoantigens. One Sentence Summary: Transposable element expression in tumors is associated with increased immune response and provides tumor-associated antigens.


2019 ◽  
Author(s):  
Margaret Linan ◽  
Junwen Wang ◽  
Valentin Dinu

Abstract We performed a comprehensive pan-cancer analysis in the Cancer Genomics Cloud of HTSeq-FPKM normalized protein-coding mRNA data from 17 cancer projects in The Cancer Genome Atlas: Adrenal Gland, Bile Duct, Bladder, Brain, Breast, Cervix, Colorectal, Esophagus, Head and Neck, Kidney, Liver, Lung, Pancreas, Prostate, Stomach, Thyroid and Uterus. The PoTRA algorithm was applied to the normalized protein-coding mRNA data and detected dysregulated pathways that can be implicated in the pathogenesis of these cancers. The PageRank algorithm was then applied to the PoTRA results to find the most influential dysregulated pathways among all 17 cancer types. Pathways in cancer is the most common dysregulated pathway, and the MAPK signaling pathway is the most influential (PageRank score = 0.2034), while the purine metabolism pathway is the most significantly dysregulated metabolic pathway.
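
As background for the PageRank step mentioned above, here is a minimal sketch of the standard power-iteration form of PageRank on a small directed graph. It is not the authors' pipeline: the abstract does not say how a graph was built from the PoTRA results, so the graph, the node labels, and the parameter values (damping 0.85, tolerance, iteration cap) are all assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    graph maps each node to a list of nodes it points to.
    Returns a dict of node -> score; scores sum to 1.
    """
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph; the labels are placeholders, not real pathways.
toy_graph = {"a": ["hub"], "b": ["hub"], "hub": ["a"]}
scores = pagerank(toy_graph)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

In this toy graph the node that receives the most incoming links ranks first, which is the sense in which a pathway can be called "most influential" under a PageRank score.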


2019 ◽  
Vol 20 (22) ◽  
pp. 5697 ◽  
Author(s):  
Michelle E. Pewarchuk ◽  
Mateus C. Barros-Filho ◽  
Brenda C. Minatel ◽  
David E. Cohn ◽  
Florian Guisier ◽  
...  

Recent studies have uncovered microRNAs (miRNAs) that have been overlooked in early genomic explorations, which show remarkable tissue- and context-specific expression. Here, we aim to identify and characterize previously unannotated miRNAs expressed in gastric adenocarcinoma (GA). Raw small RNA-sequencing data were analyzed using the miRMaster platform to predict and quantify previously unannotated miRNAs. A discovery cohort of 475 gastric samples (434 GA and 41 adjacent nonmalignant samples), collected by The Cancer Genome Atlas (TCGA), was evaluated. Candidate miRNAs were similarly assessed in an independent cohort of 25 gastric samples. We discovered 170 previously unannotated miRNA candidates expressed in gastric tissues. The expression of these novel miRNAs was highly specific to the gastric samples, and 143 of them were significantly deregulated between tumor and nonmalignant contexts (p-adjusted < 0.05; fold change > 1.5). Multivariate survival analyses showed that the combined expression of one previously annotated miRNA and two novel miRNA candidates was significantly predictive of patient outcome. Further, the expression of these three miRNAs was able to stratify patients into three distinct prognostic groups (p = 0.00003). These novel miRNAs were also present in the independent cohort (43 sequences detected in both cohorts). Our findings uncover novel miRNA transcripts in gastric tissues that may have implications in the biology and management of gastric adenocarcinoma.

