TrackSig: reconstructing evolutionary trajectories of mutations in cancer

2018 ◽  
Author(s):  
Yulia Rubanova ◽  
Ruian Shi ◽  
Caitlin F Harrigan ◽  
Roujia Li ◽  
Jeff Wintersinger ◽  
...  

ABSTRACT We present a new method, TrackSig, to estimate the evolutionary trajectories of signatures of different somatic mutational processes from DNA sequencing data from a single, bulk tumour sample. TrackSig uses probability distributions over mutation types, called mutational signatures, to represent different mutational processes and detects changes in signature activity using an optimal segmentation algorithm that groups somatic mutations based on their estimated cancer cellular fraction (CCF) and their mutation type (e.g. CAG->CTG). We use two different simulation frameworks to assess both TrackSig’s reconstruction accuracy and its robustness to violations of its assumptions, as well as to compare it to a baseline approach. We find 2-4% median error in reconstructing the signature activities on simulations of varying difficulty with one to three subclones at an average depth of 30x. The size and the direction of the activity change are consistent in 83% and 95% of cases, respectively. There were an average of 0.02 missed and 0.12 false positive subclones per sample. In our simulations, grouping mutations by mutation type (TrackSig), rather than by clustering CCF (baseline strategy), performs better at estimating signature activities and at identifying subclonal populations in complex scenarios such as branching, CNA gain, violation of the infinite sites assumption, and the inclusion of neutrally evolving mutations. TrackSig is open source software, freely available at https://github.com/morrislab/TrackSig.
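The abstract describes grouping mutations by their estimated CCF and mutation type. As a rough illustration only, the Python sketch below orders mutations by CCF and counts mutation types in fixed-size bins, producing the kind of timeline a segmentation step could operate on. The function name `bin_by_ccf`, the `(ccf, mutation_type)` input layout, and the `bin_size` parameter are assumptions made for this example and are not taken from the TrackSig software.

```python
from collections import Counter


def bin_by_ccf(mutations, bin_size):
    """Sort mutations by CCF (highest first) and count mutation types per bin.

    `mutations` is a list of (ccf, mutation_type) pairs. Both this layout and
    `bin_size` are illustrative assumptions, not TrackSig's interface.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    # Higher CCF is treated as earlier in the tumour's history.
    ordered = sorted(mutations, key=lambda m: m[0], reverse=True)
    bins = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        bins.append(Counter(mutation_type for _, mutation_type in chunk))
    return bins


example = [
    (0.9, "CAG->CTG"),
    (0.2, "TCT->TTT"),
    (0.8, "CAG->CTG"),
    (0.1, "TCT->TTT"),
]
bins = bin_by_ccf(example, bin_size=2)
```

In this toy input the high-CCF bin contains only CAG->CTG mutations and the low-CCF bin only TCT->TTT mutations, which is the sort of shift in mutation-type composition that a change-point search would look for.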

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Maria Cartolano ◽  
Nima Abedpour ◽  
Viktor Achter ◽  
Tsun-Po Yang ◽  
Sandra Ackermann ◽  
...  

Abstract The identification of the mutational processes operating in tumour cells has implications for cancer diagnosis and therapy. These processes leave mutational patterns on the cancer genomes, which are referred to as mutational signatures. Recently, 81 mutational signatures have been inferred using computational algorithms on sequencing data of 23,879 samples. However, these published signatures may not always offer a comprehensive view on the biological processes underlying tumour types that are not included or underrepresented in the reference studies. To circumvent this problem, we designed CaMuS (Cancer Mutational Signatures) to construct de novo signatures while simultaneously fitting publicly available mutational signatures. Furthermore, we propose to estimate signature similarity by comparing probability distributions using the Hellinger distance. We applied CaMuS to infer signatures of mutational processes in poorly studied cancer types. We used whole genome sequencing data of 56 neuroblastoma, thus providing evidence for the versatility of CaMuS. Using simulated data, we compared the performance of CaMuS to sigfit, a recently developed algorithm with comparable inference functionalities. CaMuS and sigfit reconstructed the simulated datasets with similar accuracy; however two main features may argue for CaMuS over sigfit: (i) superior computational performance and (ii) a reliable parameter selection method to avoid spurious signatures.
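The Hellinger distance mentioned above compares two discrete probability distributions and ranges from 0 (identical) to 1 (no shared support). A minimal Python sketch of the standard formula follows; the function name `hellinger` and the toy four-category signatures are assumptions for illustration and are not part of CaMuS.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Both inputs are sequences of non-negative values that each sum to 1,
    for example two mutational signatures over the same mutation types.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Toy signatures over four mutation types (hypothetical values).
sig_a = [0.7, 0.1, 0.1, 0.1]
sig_b = [0.1, 0.1, 0.1, 0.7]
distance_ab = hellinger(sig_a, sig_b)
distance_aa = hellinger(sig_a, sig_a)
distance_disjoint = hellinger([1.0, 0.0], [0.0, 1.0])
```

Because the distance is bounded, a fixed threshold can be applied regardless of how many mutation types the signatures are defined over.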


2014 ◽  
Author(s):  
Julian S. Gehring ◽  
Bernd Fischer ◽  
Michael Lawrence ◽  
Wolfgang Huber

Mutational signatures are patterns in the occurrence of somatic single nucleotide variants (SNVs) that can reflect underlying mutational processes. The SomaticSignatures package provides flexible, interoperable, and easy-to-use tools that identify such signatures in cancer sequencing data. It facilitates large-scale, cross-dataset estimation of mutational signatures, implements existing methods for pattern decomposition, supports extension through user-defined methods and integrates with Bioconductor workflows. The R package SomaticSignatures is available as part of the Bioconductor project (R Core Team, 2014; Gentleman et al., 2004). Its documentation provides additional details on the methodology and demonstrates applications to biological datasets.


Science ◽  
2021 ◽  
pp. eaba7408
Author(s):  
Vladimir B. Seplyarskiy ◽  
Ruslan A. Soldatov ◽  
Evan Koch ◽  
Ryan J. McGinty ◽  
Jakob M. Goldmann ◽  
...  

Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. The analysis of a sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci. We provide a biological interpretation for seven of these processes. We associate one process with bulky DNA lesions that resolve asymmetrically with respect to transcription and replication. Two processes track direction of replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation primarily acting in regulatory regions and a mutagenic effect of LINE repeats. We localize a mutagenic process specific to oocytes from population sequencing data. This process appears transcriptionally asymmetric.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 4102-4102
Author(s):  
Julieta Haydee Sepulveda Yanez ◽  
Diego Alvarez ◽  
Jose Fernandez-Goycoolea ◽  
Cornelis A.M. van Bergen ◽  
Hendrik Veelken ◽  
...  

Abstract Introduction: In recent years, strategies have been developed to identify specific mutation patterns within next-generation sequencing data. Distinct mutational patterns can be linked to underlying mutagenic processes in human cancer. One approach analyzes single base substitutions in the context of their neighboring bases as trinucleotides. The relative prevalence of all possible 96 altered trinucleotides defines distinctive mutational signatures. The activity of activation-induced cytidine deaminase (AID) initiates a specific mutational process in B cells. AID induces deamination of deoxycytidine into deoxyuridine. Subsequent mechanisms to repair the resulting mismatch lead to different genomic alterations that can be assigned to three mutational signatures: a canonical signature characterized by C>T/G transitions at WRCY motifs, a non-canonical signature defined by A>C transversions at WAN motifs, and a third AID signature characterized by C>T transitions at RCG motifs with preference for methylated CpG (W: A or T; R: purine; Y: pyrimidine, N: any nucleotide). The latter signature has specifically been designated as AID-mediated CpG-methylation-dependent mutagenesis. AID activity has been linked to the pathogenesis of several B-cell lymphomas, including follicular lymphoma (FL), chronic lymphocytic leukemia (CLL), and mantle cell lymphoma (MCL). Therefore, we searched for the contribution of different AID signatures in these B-cell malignancies. Methods: We analyzed the mutational landscape in whole exome (WES) and whole genome (WGS) sequencing data from 41 FL, 30 CLL, 2 MBL, and 43 MCL cases. Somatic variants were called by comparison of tumor and germline DNA with an in-house developed pipeline. Mutational signatures were defined according to the 96-base substitution model (Alexandrov et al. 2013) by an unsupervised machine learning with implementation of the SomaticSignatures R package (Gehring et al. 2015). 
In addition, the MutationalPatterns R package (Blokzijl et al. 2018) was executed for comparison to mutational signatures defined in COSMIC. Results: In unsupervised analyses of FL, CLL/MBL, and MCL cases, 77% of the mutation spectrum variance was attributable to four signatures (S1-4). In FL, the mutational landscape was dominated by S4 characterized by mutations in both canonical and non-canonical AID motifs (40%, 95% CI: 35-76%). The second most frequent signature (S2; 27%, 21-49%) was characterized by C>A transitions in the context of the non-canonical AID and the CpG hotspot motifs (RCG). The mutational landscape of CLL and MBL was strongly dominated by signature S3 (50%, 45-95%). S3 contains mutations in RCG motifs as well as mutations in non-canonical AID motifs (NTW), but with a lower contribution than in S4. In contrast, the mutational landscape of MCL was dominated by S1 (31%, 24-55%) characterized by C>T transitions in the RCG motif in addition to a striking prevalence of the TCT>TTT transition that is known to be associated with the activity of APOBEC enzymes. In comparison to the mutational signatures in COSMIC, the lymphomas analyzed here carry a strong similarity to the COSMIC signatures 1, 5, and 25. These signatures are observed across a wide spectrum of cancer types and are either of unknown etiology (S5 and S25) or associated with age (S1). Conclusions: The most common point mutations in CLL/MBL and FL are C>T transitions and indicate a strong influence of AID on their mutational landscape. In the indolent B-cell malignancies, all three known AID-related signatures, i.e. canonical, non-canonical, and CpG-methylation-dependent can be found. In contrast, the genomic landscape of MCL is dominated by variants in CpG-methylation-dependent mutagenesis sites and by an APOBEC-related motif. In addition to AID-related signatures, we also found consensus signatures described in COSMIC such as the age-related spontaneous deamination signature 1. 
Our work independently confirms the role of AID in B-cell lymphoma pathogenesis but points to disease-specific mechanisms that modulate AID in the respective lymphoma cell of origin. In addition, our data suggest that distinctive repair mechanisms operate in different entities. Disclosures No relevant conflicts of interest to declare.
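The 96-substitution model referred to in the abstract above arises from 6 pyrimidine-centred substitution classes (C>A, C>G, C>T, T>A, T>C, T>G) combined with 4 possible 5' bases and 4 possible 3' bases. The Python sketch below enumerates these categories and folds a purine-centred mutation onto its reverse-complement strand. The `5'[ref>alt]3'` label format and the function names are assumptions chosen for illustration, not the output format of any package named in this listing.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Substitutions are reported with the pyrimidine (C or T) as the reference base.
SUBSTITUTIONS = {"C": "AGT", "T": "ACG"}


def contexts_96():
    """Return the 96 trinucleotide substitution categories as labels."""
    labels = []
    for ref, alts in SUBSTITUTIONS.items():
        for alt in alts:
            for five, three in product(BASES, repeat=2):
                labels.append(f"{five}[{ref}>{alt}]{three}")
    return labels


def to_pyrimidine_context(five, ref, alt, three):
    """Label one substitution, reverse-complementing if the reference is a purine."""
    if ref in SUBSTITUTIONS:
        return f"{five}[{ref}>{alt}]{three}"
    # On the opposite strand the flanking bases swap sides and are complemented.
    return (
        f"{COMPLEMENT[three]}[{COMPLEMENT[ref]}>{COMPLEMENT[alt]}]"
        f"{COMPLEMENT[five]}"
    )


all_contexts = contexts_96()
folded = to_pyrimidine_context("A", "G", "A", "G")  # AGG>AAG on the forward strand
```

Here the forward-strand change AGG>AAG is equivalent to CCT>CTT on the reverse strand, so it is counted in the `C[C>T]T` category.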


2016 ◽  
Author(s):  
Ludmil B. Alexandrov ◽  
Young Seok Ju ◽  
Kerstin Haase ◽  
Peter Van Loo ◽  
Iñigo Martincorena ◽  
...  

ABSTRACT Tobacco smoking increases the risk of at least 15 classes of cancer. We analyzed somatic mutations and DNA methylation in 5,243 cancers of types for which tobacco smoking confers an elevated risk. Smoking is associated with increased mutation burdens of multiple distinct mutational signatures, which contribute to different extents in different cancers. One of these signatures, mainly found in cancers derived from tissues directly exposed to tobacco smoke, is attributable to misreplication of DNA damage caused by tobacco carcinogens. Others likely reflect indirect activation of DNA editing by APOBEC cytidine deaminases and of an endogenous clock-like mutational process. The results are consistent with the proposition that smoking increases cancer risk by increasing the somatic mutation load, although direct evidence for this mechanism is lacking in some smoking-related cancer types. ONE SENTENCE SUMMARY Multiple distinct mutational processes associate with tobacco smoking in cancer reflecting direct and indirect effects of tobacco smoke.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Jie Tang ◽  
Kailing Tu ◽  
Keying Lu ◽  
Jiaxun Zhang ◽  
Kai Luo ◽  
...  

Abstract Background: Colorectal cancer (CRC) is a major cancer type whose mechanism of metastasis remains elusive. Methods: In this study, we characterised the evolutionary pattern of metastatic CRC (mCRC) by analysing bulk and single-cell exome sequencing data of primary and metastatic tumours from 7 CRC patients with liver metastases. Here, 7 CRC patients were analysed by bulk whole-exome sequencing (WES); 4 of these were also analysed using single-cell sequencing. Results: Despite low genomic divergence between paired primary and metastatic cancers in the bulk data, single-cell WES (scWES) data revealed rare mutations and defined two separate cell populations, indicative of the diverse evolutionary trajectories between primary and metastatic tumour cells. We further identified 24 metastatic cell-specific-mutated genes and validated their functions in cell migration capacity. Conclusions: In summary, scWES revealed rare mutations that failed to be detected by bulk WES. These rare mutations better define the distinct genomic profiles of primary and metastatic tumour cell clones.


2019 ◽  
Author(s):  
Zsofia Sztupinszki ◽  
Miklos Diossy ◽  
Marcin Krzystanek ◽  
Judit Borcsok ◽  
Mark Pomerantz ◽  
...  

Abstract Background: Prostate cancers with mutations in genes involved in homologous recombination (HR), most commonly BRCA2, respond favorably to PARP inhibition and platinum-based chemotherapy. It is not clear, however, whether other prostate tumors that do not harbor deleterious mutations in these particular genes can similarly be deficient in HR, rendering them sensitive to HR-directed therapies. To identify a more comprehensive set of prostate cancer cases with homologous recombination deficiency (HRD) including those cases that do not harbor mutations in known HR genes. HRD levels can be estimated using various mutational signatures derived from next-generation sequencing data. We used this approach to determine whether prostate cancer cases display clear signs of HRD in somatic tumor biopsies. Whole genome (n=311) and whole exome sequencing data (n=498) of both primary and metastatic prostate adenocarcinomas (PRAD) were analyzed. Results: Known BRCA-deficient samples showed robust signs of HR-deficiency associated mutational signatures. HRD-patterns were also detected in a subset of patients who did not harbor germline or somatic mutations in BRCA1/2 or other HR related genes. Patients with HRD signatures had a significantly worse prognosis than patients without signs of HRD. Conclusions: These findings may expand the number of cases likely to respond to PARP-inhibitor treatment. Based on the HRD associated mutational signatures, 5-8% of prostate cancer cases may be good candidates for PARP-inhibitor treatment (including those with BRCA1/2 mutations).


2018 ◽  
Author(s):  
Quinn K. Langdon ◽  
David Peris ◽  
Brian Kyle ◽  
Chris Todd Hittinger

Abstract The genomics era has expanded our knowledge about the diversity of the living world, yet harnessing high-throughput sequencing data to investigate alternative evolutionary trajectories, such as hybridization, is still challenging. Here we present sppIDer, a pipeline for the characterization of interspecies hybrids and pure species that illuminates the complete composition of genomes. sppIDer maps short-read sequencing data to a combination genome built from reference genomes of several species of interest and assesses the genomic contribution and relative ploidy of each parental species, producing a series of colorful graphical outputs ready for publication. As a proof-of-concept, we use the genus Saccharomyces to detect and visualize both interspecies hybrids and pure strains, even with missing parental reference genomes. Through simulation, we show that sppIDer is robust to variable reference genome qualities and performs well with low-coverage data. We further demonstrate the power of this approach in plants, animals, and other fungi. sppIDer is robust to many different inputs and provides visually intuitive insight into genome composition that enables the rapid identification of species and their interspecies hybrids. sppIDer exists as a Docker image, which is a reusable, reproducible, transparent, and simple-to-run package that automates the pipeline and installation of the required dependencies (https://github.com/GLBRC/sppIDer).


2021 ◽  
Author(s):  
Freek Manders ◽  
Arianne M. Brandsma ◽  
Jurrian de Kanter ◽  
Mark Verheul ◽  
Rurika Oka ◽  
...  

Background: The collective of somatic mutations in a genome represents a record of mutational processes that have been operative in a cell. These processes can be investigated by extracting relevant mutational patterns from sequencing data. Results: Here, we present the next version of MutationalPatterns, an R/Bioconductor package, which allows in-depth mutational analysis of catalogues of single and double base substitutions as well as small insertions and deletions. Major features of the package include the possibility to perform regional mutation spectra analyses and the possibility to detect strand asymmetry phenomena, such as lesion segregation. On top of this, the package also contains functions to determine how likely it is that a signature can cause damaging mutations (i.e., mutations that affect protein function). This updated package supports stricter signature refitting on known signatures in order to prevent overfitting. Using simulated mutation matrices containing varied signature contributions, we showed that reliable refitting can be achieved even when only 50 mutations are present per signature. Additionally, we incorporated bootstrapped signature refitting to assess the robustness of the signature analyses. Finally, we applied the package on genome mutation data of cell lines in which we deleted specific DNA repair processes and on large cancer datasets, to show how the package can be used to generate novel biological insights. Conclusions: This novel version of MutationalPatterns allows for more comprehensive analyses and visualization of mutational patterns in order to study the underlying processes. Ultimately, in-depth mutational analyses may contribute to improved biological insights in mechanisms of mutation accumulation as well as aid cancer diagnostics. MutationalPatterns is freely available at http://bioconductor.org/packages/MutationalPatterns.
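Signature refitting, as described above, estimates how much each known signature contributes to an observed mutation catalogue. The Python sketch below shows the idea with a plain expectation-maximisation update for a mixture of fixed signatures. This is a generic technique swapped in for brevity, not the algorithm implemented in MutationalPatterns; the function name `refit_exposures`, the toy four-category signatures, and the iteration count are all assumptions made for the example.

```python
def refit_exposures(counts, signatures, n_iter=200):
    """Estimate relative signature contributions for one mutation catalogue.

    counts     -- observed mutation counts per mutation type
    signatures -- list of signatures, each a probability vector over the
                  same mutation types as `counts`
    Returns contributions that sum to 1. Generic EM sketch, not the
    MutationalPatterns implementation.
    """
    n_sig = len(signatures)
    if n_sig == 0 or sum(counts) == 0:
        raise ValueError("need at least one signature and one mutation")
    exposures = [1.0 / n_sig] * n_sig
    for _ in range(n_iter):
        assigned = [0.0] * n_sig
        for i, count in enumerate(counts):
            weights = [exposures[k] * signatures[k][i] for k in range(n_sig)]
            denom = sum(weights)
            if count == 0 or denom == 0.0:
                continue  # type absent, or not explained by any signature
            # E-step: split the observed count across signatures.
            for k in range(n_sig):
                assigned[k] += count * weights[k] / denom
        total = sum(assigned)
        if total == 0.0:
            break
        # M-step: contributions are the share of mutations assigned to each.
        exposures = [value / total for value in assigned]
    return exposures


# Two hypothetical signatures with non-overlapping mutation types.
sig_a = [0.5, 0.5, 0.0, 0.0]
sig_b = [0.0, 0.0, 0.5, 0.5]
estimated = refit_exposures([30, 30, 20, 20], [sig_a, sig_b])
```

With these non-overlapping toy signatures, 60 of the 100 mutations can only come from the first signature and 40 from the second, so the estimate settles at contributions of 0.6 and 0.4. Repeating the fit on catalogues resampled with replacement would give a bootstrap-style view of how stable such estimates are.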

