CanDriS: posterior profiling of cancer-driving sites based on two-component evolutionary model

Author(s):  
Wenyi Zhao ◽  
Jingwen Yang ◽  
Jingcheng Wu ◽  
Guoxing Cai ◽  
Yao Zhang ◽  
...  

Abstract Current cancer genomics databases have accumulated millions of somatic mutations that remain to be further explored. Because mutations unrelated to cancer are in great excess, the main challenge is to identify the somatic mutations that drive cancer. Under the notion that carcinogenesis is a form of somatic-cell evolution, we developed a two-component mixture model: the ground component corresponds to passenger mutations, while the rapidly evolving component corresponds to driver mutations. We then implemented an empirical Bayesian procedure to calculate the posterior probability of a site being cancer-driving. On this basis, we developed the software CanDriS (Cancer Driver Sites) to profile the potential cancer-driving sites for thousands of tumor samples from The Cancer Genome Atlas and the International Cancer Genome Consortium, both within tumor types and at the pan-cancer level. As a result, we found that approximately 1% of the sites have posterior probabilities larger than 0.90, and we listed potential cancer-wide and cancer-specific driver mutations. By comprehensively profiling all potential cancer-driving sites, CanDriS greatly enhances our ability to refine our knowledge of the genetic basis of cancer and might guide clinical medication in the upcoming era of precision medicine. The results are displayed in the database CandrisDB (http://biopharm.zju.edu.cn/candrisdb/).
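The posterior calculation described in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the per-site mutation count follows a Poisson distribution within each component; the rates, the mixing fraction, and the function names are hypothetical and are not taken from CanDriS.

```python
import math

def poisson_pmf(count, rate):
    """Probability of observing `count` events under a Poisson rate."""
    return math.exp(-rate) * rate ** count / math.factorial(count)

def driver_posterior(count, passenger_rate, driver_rate, driver_fraction):
    """Bayes' rule for a two-component mixture: P(driver | observed count).

    All parameters are illustrative assumptions; in an empirical Bayesian
    procedure they would first be estimated from the data.
    """
    p_driver = driver_fraction * poisson_pmf(count, driver_rate)
    p_passenger = (1.0 - driver_fraction) * poisson_pmf(count, passenger_rate)
    return p_driver / (p_driver + p_passenger)

# Hypothetical parameters: a slow passenger component, a fast driver
# component, and 1% of sites belonging to the driver component.
counts = [0, 1, 2, 5, 12]
posteriors = [driver_posterior(c, 0.5, 8.0, 0.01) for c in counts]
for c, p in zip(counts, posteriors):
    print(f"count={c:2d}  posterior={p:.4f}")
```

Under these assumed parameters, sites with few recurrent mutations receive posteriors near zero, while heavily recurrent sites approach one, which is the behavior a threshold such as 0.90 relies on.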

2020 ◽  
Author(s):  
Xun Gu

Abstract Current cancer genomics databases have accumulated millions of somatic mutations that remain to be further explored, facilitating large-scale high-throughput analyses of the underlying mechanisms that may contribute to malignant initiation or progression. Because passenger mutations (unrelated to cancer) predominate, the challenge is to identify the somatic mutations that are cancer-driving. Under the notion that carcinogenesis is a form of somatic-cell evolution, we developed a two-component mixture model that enables the following analyses. (i) We formulated a quasi-likelihood approach to test whether the two-component model is significantly better than a single-component model, which can be used to predict new cancer genes. (ii) We implemented an empirical Bayesian method to calculate the posterior probability that a site is cancer-driving for all sites of a gene, which can be used to predict new driving sites. (iii) We developed a computational procedure to calculate the somatic selection intensity at driver sites and passenger sites, respectively, as well as site-specific profiles for all sites. Using these newly developed methods, we comprehensively analyzed 294 known cancer genes based on The Cancer Genome Atlas (TCGA) database.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1289-D1301 ◽  
Author(s):  
Tao Wang ◽  
Shasha Ruan ◽  
Xiaolu Zhao ◽  
Xiaohui Shi ◽  
Huajing Teng ◽  
...  

Abstract The prevalence of neutral mutations in cancer cell populations makes it difficult to distinguish cancer-causing driver mutations from passenger mutations. To systematically prioritize the oncogenic ability of somatic mutations and cancer genes, we constructed a platform, OncoVar (https://oncovar.org/), which employs published bioinformatics algorithms and incorporates known driver events to identify driver mutations and driver genes. We identified 20 162 cancer driver mutations, 814 driver genes and 2360 pathogenic pathways with high confidence by reanalyzing 10 769 exomes from 33 cancer types in The Cancer Genome Atlas (TCGA) and 1942 genomes from 18 cancer types in the International Cancer Genome Consortium (ICGC). OncoVar provides four points of view, ‘Mutation’, ‘Gene’, ‘Pathway’ and ‘Cancer’, to help researchers visualize the relationships between cancers and driver variants. Importantly, identification of actionable driver alterations provides promising druggable targets and repurposing opportunities for combinational therapies. OncoVar provides a user-friendly interface for browsing, searching and downloading somatic driver mutations, driver genes and pathogenic pathways in various cancer types. This platform will facilitate the identification of cancer drivers across individual cancer cohorts and help rank mutations or genes for better decision-making among clinical oncologists, cancer researchers and the broad scientific community interested in cancer precision medicine.


2014 ◽  
Author(s):  
Endre Sebestyén ◽  
Michał Zawisza ◽  
Eduardo Eyras

Cancer genomics has been instrumental in determining the genetic alterations that are predictive of various tumor conditions. However, the majority of these alterations occur at low frequencies, motivating the need to expand the catalogue of cancer signatures. Alternative pre-mRNA splicing alterations, which are of major importance for the understanding of cancer, have not yet been exhaustively studied in the context of recent cancer genome projects. In this article we analyze RNA sequencing data for more than 4000 samples from The Cancer Genome Atlas (TCGA) project, including paired normal samples, to detect recurrent alternative splicing isoform switches in 9 different cancer types. We first investigate whether alternative splicing isoform changes are predictive of tumors by applying a rank-based algorithm based on the reversal of the relative expression of transcript isoforms. We find that consistent alternative splicing isoform changes can separate tumor and normal samples, as well as some cancer subtypes, with high accuracy. We then search for those changes that occur in the most abundant isoform, i.e., isoform switches, and are therefore more likely to have a functional impact. In total we detect 244 isoform switches, which are associated with functional pathways that are frequently altered in cancer and also separate tumor and normal samples accurately. We further assess whether these isoform changes are associated with somatic mutations. Surprisingly, only a few cases show such an association, including the putative tumor suppressor FBLN2 and the tumor driver MYH11, in which an isoform switch is associated with mutations and indels on the alternatively spliced exon. However, the number of observed mutations is in general not sufficient to explain the frequency of the isoform switches found, suggesting that recurrent isoform switching in cancer is mostly independent of somatic mutations.
In summary, we present an effective approach to detect novel alternative splicing signatures that are predictive of tumors. Moreover, the same methodology has uncovered recurrent isoform switches in tumors, which may provide novel prognostic and therapeutic targets. Software and data are available at: https://bitbucket.org/regulatorygenomicsupf/iso-ktsp and http://dx.doi.org/10.6084/m9.figshare.1061917
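The rank-based idea mentioned in the abstract above, scoring a pair of isoforms by how consistently their relative expression reverses between tumor and normal samples, can be sketched as follows. This is only an illustration of the pairwise reversal principle with made-up expression values; the function name and data are hypothetical and do not reproduce the iso-kTSP software.

```python
def reversal_score(iso_a, iso_b, is_tumor):
    """Score how strongly the ordering of two isoforms flips between classes.

    Returns |P(A > B | tumor) - P(A > B | normal)|, which is 1.0 for a
    perfect reversal and 0.0 when the ordering carries no class information.
    """
    tumor_hits = [a > b for a, b, t in zip(iso_a, iso_b, is_tumor) if t]
    normal_hits = [a > b for a, b, t in zip(iso_a, iso_b, is_tumor) if not t]
    p_tumor = sum(tumor_hits) / len(tumor_hits)
    p_normal = sum(normal_hits) / len(normal_hits)
    return abs(p_tumor - p_normal)

# Toy expression values (hypothetical): isoform A dominates in the three
# tumor samples, isoform B dominates in the three normal samples.
iso_a = [5.0, 6.0, 7.0, 1.0, 2.0, 1.0]
iso_b = [2.0, 1.0, 3.0, 6.0, 5.0, 7.0]
is_tumor = [True, True, True, False, False, False]

print(reversal_score(iso_a, iso_b, is_tumor))
```

Because the score depends only on which isoform ranks higher within each sample, it is unaffected by any per-sample normalization that preserves that ordering.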


2015 ◽  
Author(s):  
Radhakrishnan Sabarinathan ◽  
Loris Mularoni ◽  
Jordi Deu-Pons ◽  
Abel Gonzalez-Perez ◽  
Nuria Lopez-Bigas

Somatic mutations are the driving force of cancer genome evolution. The rate of somatic mutations varies greatly across the genome due to chromatin organization, DNA accessibility and replication timing. However, other variables that may influence the mutation rate locally, such as DNA-binding proteins, are unknown. Here we demonstrate that the rate of somatic mutations in melanoma tumors is highly increased at active transcription factor binding sites (TFBS) and nucleosome-embedded DNA, compared to their flanking regions. Using recently available excision-repair sequencing (XR-seq) data, we show that the higher mutation rate at these sites is caused by a decrease in nucleotide excision repair (NER) activity. Therefore, our work demonstrates that DNA-bound proteins interfere with the NER machinery, which results in an increased rate of mutations at their binding sites. This finding has important implications for our understanding of mutational and DNA repair processes and for the identification of cancer driver mutations.


2022 ◽  
Vol 13 (1) ◽  
Author(s):  
John K. L. Wong ◽  
Christian Aichmüller ◽  
Markus Schulze ◽  
Mario Hlevnjak ◽  
Shaymaa Elgaafary ◽  
...  

Abstract Cancer-driving mutations are difficult to identify, especially in the non-coding part of the genome. Here, we present sigDriver, an algorithm dedicated to calling driver mutations. Using 3813 whole-genome sequenced tumors from the International Cancer Genome Consortium, The Cancer Genome Atlas Program, and a childhood pan-cancer cohort, we employ mutational signatures based on single-base substitutions in the context of tri- and penta-nucleotide motifs for hotspot discovery. Knowledge-based annotations on mutational hotspots reveal enrichment in coding regions and regulatory elements for 6 mutational signatures, including APOBEC and somatic hypermutation signatures. APOBEC activity is associated with 32 hotspots, of which 11 are known and 11 are putative regulatory drivers. Clusters of somatic single-nucleotide variants detected at hypermutation-associated hotspots are distinct from translocations or gene amplifications. Patients carrying APOBEC-induced PIK3CA driver mutations show lower occurrence of signature SBS39. In summary, sigDriver uncovers mutational processes associated with known and putative tumor drivers and hotspots, particularly in the non-coding regions of the genome.


2017 ◽  
Author(s):  
Jaime Iranzo ◽  
Iñigo Martincorena ◽  
Eugene V. Koonin

Abstract Cancer genomics has produced extensive information on cancer-associated genes, but the number and specificity of cancer driver mutations remain a matter of debate. We constructed a bipartite network in which 7665 tumors from 30 cancer types are connected via shared mutations in 198 previously identified cancer-associated genes. We show that 27% of the tumors can be assigned to statistically supported modules, most of which encompass 1-2 cancer types. The rest of the tumors belong to a diffuse network component, suggesting lower gene-specificity of driver mutations. Linear regression of the mutational loads in cancer-associated genes was used to estimate the number of drivers required for the onset of different cancers. The mean number of drivers is ~2, with a range of 1 to 5. Cancers that are associated with modules had more drivers than those from the diffuse network component, suggesting that unidentified and/or interchangeable drivers exist in the latter.


2021 ◽  
Author(s):  
Noor Kherreh ◽  
Siobhán Cleary ◽  
Cathal Seoighe

Abstract The major histocompatibility complex (MHC) molecules are capable of presenting neoantigens resulting from somatic mutations on cell surfaces, potentially directing immune responses against cancer. This led to the hypothesis that cancer driver mutations may occur in gaps in the capacity to present neoantigens that are dependent on MHC genotype. If this is correct, it has important implications for understanding oncogenesis and may help to predict driver mutations based on genotype data. In support of this hypothesis, it has been reported that driver mutations that occur frequently tend to be poorly presented by common MHC alleles and that the capacity of a patient’s MHC alleles to present the resulting neoantigens is predictive of the driver mutations that are observed in their tumour. Here we show that these reports of a strong relationship between driver mutation occurrence and patient MHC alleles are a consequence of unjustified statistical assumptions. Our reanalysis of the data provides no evidence of an effect of MHC genotype on the oncogenic mutation landscape.


2020 ◽  
Vol 21 (15) ◽  
pp. 1073-1084
Author(s):  
Laurentijn Tilleman ◽  
Björn Heindryckx ◽  
Dieter Deforce ◽  
Filip Van Nieuwerburgh

Aim: This study provides clinicians and researchers with an informed choice between current commercially available targeted sequencing panels and exome sequencing panels in the context of pan-cancer pharmacogenetics. Materials & methods: Nine contemporary commercially available targeted pan-cancer panels and the xGen Exome Research Panel v2 were investigated to determine to what extent they cover the pharmacogenetic variant–drug interactions in five available cancer knowledgebases, and the driver mutations and fusion genes in The Cancer Genome Atlas. Results: The xGen Exome Research Panel v2 and TruSight Oncology 500 target 71.0 and 68.9% of the pharmacogenetic interactions in the available knowledgebases, and 93.7 and 86.0% of the driver mutations in The Cancer Genome Atlas, respectively. All other studied panels target lower percentages. Conclusion: Exome sequencing outperforms pan-cancer targeted sequencing panels in terms of covered cancer pharmacogenetic variant–drug interactions and pharmacogenetic cancer variants.
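The kind of coverage percentage reported above can be computed, in principle, by checking which variant positions fall inside a panel's target intervals. The sketch below is a simplified illustration with toy coordinates; the function name, the 1-based inclusive interval convention, and the data are assumptions and do not come from the study.

```python
def covered_fraction(variants, targets):
    """Fraction of variants that fall inside at least one target interval.

    variants: list of (chromosome, position) tuples.
    targets:  list of (chromosome, start, end) tuples, 1-based inclusive
              (an assumed convention for this sketch).
    """
    def is_covered(chrom, pos):
        return any(c == chrom and start <= pos <= end
                   for c, start, end in targets)

    hits = sum(is_covered(chrom, pos) for chrom, pos in variants)
    return hits / len(variants)

# Toy coordinates (hypothetical, not real panel or knowledgebase data).
variants = [("chr1", 100), ("chr1", 250), ("chr2", 50), ("chr3", 10)]
targets = [("chr1", 90, 120), ("chr2", 40, 60)]

print(f"{100 * covered_fraction(variants, targets):.1f}% of variants targeted")
```

A linear scan over targets is adequate for a toy example; real panels with many thousands of intervals would normally call for sorted intervals or an interval tree.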


2015 ◽  
Author(s):  
Sunho Park ◽  
Seung-Jun Kim ◽  
Donghyeon Yu ◽  
Samuel Pena-Llopis ◽  
Jianjiong Gao ◽  
...  

Identification of altered pathways that are clinically relevant across human cancers is a key challenge in cancer genomics. We developed a network-based algorithm to integrate somatic mutation data with gene networks and pathways, in order to identify pathways altered by somatic mutations across cancers. We applied our approach to The Cancer Genome Atlas (TCGA) dataset of somatic mutations in 4,790 cancer patients with 19 different types of malignancies. Our analysis identified cancer-type-specific altered pathways enriched with known cancer-relevant genes and drug targets. Consensus clustering using gene expression datasets that included 4,870 patients from TCGA and multiple independent cohorts confirmed that the altered pathways could be used to stratify patients into subgroups with significantly different clinical outcomes. Of particular significance, certain patient subpopulations with poor prognosis were identified because they had specific altered pathways for which there are available targeted therapies. These findings could be used to tailor and intensify therapy in these patients, for whom current therapy is suboptimal.


2021 ◽  
Author(s):  
Jaime Davila ◽  
Pritha Chanana ◽  
Vivekananda Sarangi ◽  
Zach Fogarty ◽  
John Weroha ◽  
...  

Abstract Background: DNA polymerase epsilon (POLE) is encoded by the POLE gene, and POLE-driven tumors are characterized by high mutational rates. POLE-driven tumors are relatively common in endometrial and colorectal cancer, and their presence is increasingly recognized in ovarian cancer (OC) of endometrioid type. POLE-driven cases possess an abundance of TCT>TAT and TCG>TTG somatic mutations characterized by mutational signature 10 from the Catalog of Somatic Mutations in Cancer (COSMIC). By quantifying the contribution of COSMIC mutational signature 10 in RNA sequencing (RNA-seq) we set out to identify POLE-driven tumors in a set of unselected Mayo Clinic OC. Methods: Mutational profiles were calculated using expressed single-nucleotide variants (eSNV) in the Mayo Clinic OC tumors (n=195), The Cancer Genome Atlas (TCGA) OC tumors (n=419), and the Genotype-Tissue Expression (GTEx) normal ovarian tissues (n=84). Non-negative Matrix Factorization (NMF) of the mutational profiles inferred the contribution per sample of four distinct mutational signatures, one of which corresponds to COSMIC mutational signature 10. Results: In the Mayo Clinic OC cohort we identified six tumors with a predicted contribution from COSMIC mutational signature 10 of over five mutations per megabase. These six cases harbored known POLE hotspot mutations (P286R, S297F, V411L, and A456P) and were of endometrioid histotype (P=5e-04). These six tumors were hypermutated with a higher tumor mutation load (mean, 54.02 mutations per megabase) compared to non-POLE endometrioid OC cases (mean, 7.69 mutations per megabase; P=5e-04), and had an early onset (average age of patients at onset, 48.33 years) when compared to non-POLE endometrioid OC cohort (average age at onset, 60.13 years; P=.008). 
Samples from TCGA and GTEx had a low COSMIC signature 10 contribution (median 0.16 mutations per megabase; maximum 1.78 mutations per megabase) and carried no POLE hotspot mutations. Conclusions: From the largest cohort of RNA-seq from endometrioid OC to date (n=53), we identified six hypermutated samples likely driven by POLE (frequency, 11%). Our result suggests the clinical need to screen for POLE driver mutations in endometrioid OC, which can guide enrollment in immunotherapy clinical trials.

