mmsig: a fitting approach to accurately identify somatic mutational signatures in hematological malignancies

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Even H. Rustad ◽  
Ferran Nadeu ◽  
Nicos Angelopoulos ◽  
Bachisio Ziccheddu ◽  
Niccolò Bolli ◽  
...  

Abstract Mutational signatures have emerged as powerful biomarkers in cancer patients, with prognostic and therapeutic implications. Wider clinical utility requires access to reproducible algorithms, which allow characterization of mutational signatures in a given tumor sample. Here, we show how mutational signature fitting can be applied to hematological cancer genomes to identify biologically and clinically important mutational processes, highlighting the importance of careful interpretation in light of biological knowledge. Our newly released R package mmsig comes with a dynamic error-suppression procedure that improves specificity in important clinical and biological settings. In particular, mmsig allows accurate detection of mutational signatures with low abundance, such as those introduced by APOBEC cytidine deaminases. This is particularly important in the most recent mutational signature reference (COSMIC v3.1) where each signature is more clearly defined. Our mutational signature fitting algorithm mmsig is a robust tool that can be implemented immediately in the clinic.
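As a rough illustration of what "signature fitting" means in general, the sketch below estimates non-negative exposures for a fixed set of reference signatures so that their weighted sum approximates a sample's mutation catalog. It is a generic least-squares fit with multiplicative updates, not the mmsig procedure (the abstract does not describe mmsig's internals), and the function name, the two four-category signatures and the counts are all hypothetical toy values.

```python
# Toy signature fitting: find non-negative exposures e so that
# catalog ≈ sum_i e[i] * signatures[i].
# Hypothetical names and data; NOT the mmsig algorithm.

def fit_exposures(catalog, signatures, iterations=500, eps=1e-12):
    """Least-squares fit with multiplicative updates (exposures stay >= 0).

    catalog:    mutation counts, one per mutation category
    signatures: reference signatures, each a list of category
                probabilities with the same length as catalog
    """
    n_sig = len(signatures)
    n_cat = len(catalog)
    exposures = [sum(catalog) / n_sig] * n_sig  # start from an even split

    # S^T m and S^T S do not change between iterations.
    stm = [sum(sig[k] * catalog[k] for k in range(n_cat)) for sig in signatures]
    sts = [[sum(a[k] * b[k] for k in range(n_cat)) for b in signatures]
           for a in signatures]

    for _ in range(iterations):
        denom = [sum(sts[i][j] * exposures[j] for j in range(n_sig))
                 for i in range(n_sig)]
        # e_i <- e_i * (S^T m)_i / (S^T S e)_i
        exposures = [exposures[i] * stm[i] / (denom[i] + eps)
                     for i in range(n_sig)]
    return exposures


# Two made-up signatures over four mutation categories.
sig_a = [0.7, 0.1, 0.1, 0.1]
sig_b = [0.1, 0.1, 0.1, 0.7]
# Catalog built as 30 mutations from sig_a plus 70 from sig_b.
catalog = [28, 10, 10, 52]

exposures = fit_exposures(catalog, [sig_a, sig_b])
print([round(e, 3) for e in exposures])
```

In a real analysis the catalog would have 96 trinucleotide categories and the signatures would come from a reference such as COSMIC; a plain fit like this one has no safeguard against assigning a few mutations to signatures that are not truly active, which is the kind of specificity problem the abstract says mmsig addresses.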

2019 ◽  
Author(s):  
Harald Vöhringer ◽  
Arne van Hoeck ◽  
Edwin Cuppen ◽  
Moritz Gerstung

Abstract Mutational signature analysis is an essential part of the cancer genome analysis toolkit. Conventionally, mutational signature analysis extracts patterns of different mutation types across many cancer genomes. Here we present TensorSignatures, an algorithm to learn mutational signatures jointly across all variant categories and their genomic context. The analysis of 2,778 primary and 3,824 metastatic cancer genomes of the PCAWG consortium and the HMF cohort shows that practically all signatures operate dynamically in response to various genomic and epigenomic states. The analysis pins differential spectra of UV mutagenesis found in active and inactive chromatin to global genome nucleotide excision repair. TensorSignatures accurately characterises transcription-associated mutagenesis, which is detected in 7 different cancer types. The analysis also unmasks replication- and double strand break repair-driven APOBEC mutagenesis, which manifests with differential numbers and length of mutation clusters indicating a differential processivity of the two triggers. As a fourth example, TensorSignatures detects a signature of somatic hypermutation generating highly clustered variants around the transcription start sites of active genes in lymphoid leukaemia, distinct from a more general and less clustered signature of Polη-driven translesion synthesis found in a broad range of cancer types.

Key findings
- Simultaneous inference of mutational signatures across mutation types and genomic features refines signature spectra and defines their genomic determinants.
- Analysis of 6,602 cancer genomes reveals pervasive intra-genomic variation of mutational processes.
- Distinct mutational signatures found in quiescent and active regions of the genome reveal differential repair and mutagenicity of UV- and tobacco-induced DNA damage.
- APOBEC mutagenesis produces two signatures reflecting highly clustered, double strand break repair-initiated and lowly clustered replication-driven mutagenesis, respectively.
- Somatic hypermutation in lymphoid cancers produces a strongly clustered mutational signature localised to transcription start sites, which is distinct from a weakly clustered translesion synthesis signature found in multiple tumour types.


2017 ◽  
Author(s):  
Xiaoqing Huang ◽  
Damian Wojtowicz ◽  
Teresa M. Przytycka

Abstract Cancers arise as the result of somatically acquired changes in the DNA of cancer cells. However, in addition to the mutations that confer a growth advantage, cancer genomes accumulate a large number of somatic mutations resulting from normal DNA damage and repair processes as well as mutations triggered by carcinogenic exposures or cancer-related aberrations of DNA maintenance machinery. These mutagenic processes often produce characteristic mutational patterns called mutational signatures. Decomposition of cancer’s mutation catalog into mutations consistent with such signatures can provide valuable information about cancer etiology. However, the results from different decomposition methods are not always consistent. Hence, one needs to not only be able to decompose a patient’s mutational profile into signatures but also to establish the accuracy of such decomposition. We proposed two complementary ways of measuring confidence and stability of decomposition results and applied them to analyze mutational signatures in breast cancer genomes. We identified very stable and highly unstable signatures, as well as signatures that have been missed altogether. We also provided additional support for the novel signatures. Our results emphasize the importance of assessing the confidence and stability of inferred signature contributions. All tools developed in this paper have been implemented in an R package, called SignatureEstimation, which is available from https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/index.cgi#signatureestimation.
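One generic way to put a confidence range on a decomposition is to resample the mutation catalog and refit each replicate. The sketch below does this with a simple bootstrap and a toy least-squares fit. Whether this corresponds to either of the two measures proposed in the paper is an assumption, since the abstract does not specify them; the function names, signatures and counts are hypothetical.

```python
import random

# Bootstrap sketch for the stability of fitted signature contributions.
# Hypothetical names and toy data; not the SignatureEstimation API.

def fit_fractions(catalog, signatures, iterations=300, eps=1e-12):
    """Non-negative least-squares fit via multiplicative updates.
    Returns each signature's share of the fitted mutations."""
    n_sig, n_cat = len(signatures), len(catalog)
    e = [sum(catalog) / n_sig] * n_sig
    stm = [sum(s[k] * catalog[k] for k in range(n_cat)) for s in signatures]
    sts = [[sum(a[k] * b[k] for k in range(n_cat)) for b in signatures]
           for a in signatures]
    for _ in range(iterations):
        denom = [sum(sts[i][j] * e[j] for j in range(n_sig))
                 for i in range(n_sig)]
        e = [e[i] * stm[i] / (denom[i] + eps) for i in range(n_sig)]
    total = sum(e)
    return [x / total for x in e]


def bootstrap_intervals(catalog, signatures, n_boot=200, seed=0):
    """Resample mutations with replacement, refit each replicate, and
    return an approximate 95% percentile interval per signature."""
    rng = random.Random(seed)
    categories = list(range(len(catalog)))
    n_mut = sum(catalog)
    replicates = []
    for _ in range(n_boot):
        # Draw n_mut mutations with probability proportional to the counts.
        draws = rng.choices(categories, weights=catalog, k=n_mut)
        resampled = [0] * len(catalog)
        for k in draws:
            resampled[k] += 1
        replicates.append(fit_fractions(resampled, signatures))
    intervals = []
    for i in range(len(signatures)):
        values = sorted(rep[i] for rep in replicates)
        lo = values[int(0.025 * (n_boot - 1))]
        hi = values[int(0.975 * (n_boot - 1))]
        intervals.append((lo, hi))
    return intervals


sig_a = [0.7, 0.1, 0.1, 0.1]
sig_b = [0.1, 0.1, 0.1, 0.7]
catalog = [28, 10, 10, 52]  # 30 mutations from sig_a, 70 from sig_b

for name, (lo, hi) in zip(["sig_a", "sig_b"],
                          bootstrap_intervals(catalog, [sig_a, sig_b])):
    print(f"{name}: {lo:.2f} to {hi:.2f}")
```

A wide interval for a signature suggests that its contribution is sensitive to sampling noise in the catalog, which is one practical reading of an "unstable" signature.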


2020 ◽  
Author(s):  
Damian Wojtowicz ◽  
Jan Hoinka ◽  
Bayarbaatar Amgalan ◽  
Yoo-Ah Kim ◽  
Teresa M. Przytycka

Abstract Many mutagenic processes leave characteristic imprints on cancer genomes known as mutational signatures. These signatures have been of recent interest regarding their applicability in studying processes shaping the mutational landscape of cancer. In particular, pinpointing the presence of altered DNA repair pathways can have important therapeutic implications. However, mutational signatures of DNA repair deficiencies are often hard to infer. This challenge emerges as a result of deficient DNA repair processes acting by modifying the outcome of other mutagens. Thus, they exhibit non-additive effects that are not depicted by the current paradigm for modeling mutational processes as independent signatures. To close this gap, we present RepairSig, a method that accounts for interactions between DNA damage and repair and is able to uncover unbiased signatures of deficient DNA repair processes. In particular, RepairSig was able to replace three MMR deficiency signatures previously proposed to be active in breast cancer, with just one signature strikingly similar to the experimentally derived signature. As the first method to model interactions between mutagenic processes, RepairSig is an important step towards biologically more realistic modeling of mutational processes in cancer. The source code for RepairSig is publicly available at https://github.com/ncbi/RepairSig.


2018 ◽  
Author(s):  
Kevin Gori ◽  
Adrian Baez-Ortega

Mutational signature analysis aims to infer the mutational spectra and relative exposures of processes that contribute mutations to genomes. Different models for signature analysis have been developed, mostly based on non-negative matrix factorisation or non-linear optimisation. Here we present sigfit, an R package for mutational signature analysis that applies Bayesian inference to perform fitting and extraction of signatures from mutation data. We compare the performance of sigfit to prominent existing software, and find that it compares favourably. Moreover, sigfit introduces novel probabilistic models that enable more robust, powerful and versatile fitting and extraction of mutational signatures and broader biological patterns. The package also provides user-friendly visualisation routines and is easily integrable with other bioinformatic packages.
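For context on the non-negative matrix factorisation (NMF) approach that the abstract names as the basis of many existing models, the sketch below factorises a small categories-by-samples count matrix into signatures and exposures using Lee-Seung multiplicative updates. This is the conventional baseline, not sigfit's Bayesian models; the helper names and the toy matrix are hypothetical.

```python
import random

# Minimal NMF sketch: V (categories x samples) ≈ W (categories x rank)
# times H (rank x samples). Hypothetical toy data; not the sigfit API.

def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def squared_error(V, R):
    """Squared Frobenius distance between V and its reconstruction R."""
    return sum((V[i][j] - R[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0])))


def nmf(V, rank, iterations=200, seed=0, eps=1e-12):
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = []
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        WH = matmul(W, H)
        for r in range(rank):
            for j in range(m):
                num = sum(W[i][r] * V[i][j] for i in range(n))
                den = sum(W[i][r] * WH[i][j] for i in range(n))
                H[r][j] *= num / (den + eps)
        # W <- W * (V H^T) / (W H H^T)
        WH = matmul(W, H)
        for i in range(n):
            for r in range(rank):
                num = sum(V[i][j] * H[r][j] for j in range(m))
                den = sum(WH[i][j] * H[r][j] for j in range(m))
                W[i][r] *= num / (den + eps)
        errors.append(squared_error(V, matmul(W, H)))
    return W, H, errors


# Four mutation categories (rows) across three samples (columns).
V = [[28, 58, 40],
     [10, 10, 10],
     [10, 10, 10],
     [52, 22, 40]]

W, H, errors = nmf(V, rank=2)
print(f"error after first iteration: {errors[0]:.3f}")
print(f"error after last iteration:  {errors[-1]:.3f}")
```

The columns of W play the role of extracted signatures and the rows of H the per-sample exposures. The result depends on the random starting point and on the chosen rank, which is part of the motivation for probabilistic alternatives.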


2021 ◽  
Author(s):  
Mia Petljak ◽  
Kevan Chu ◽  
Alexandra Dananberg ◽  
Erik N. Bergstrom ◽  
Patrick von Morgen ◽  
...  

Abstract The APOBEC3 family of cytidine deaminases is widely speculated to be a major source of somatic mutations in cancer [1–3]. However, causal links between APOBEC3 enzymes and mutations in human cancer cells have not been established. The identity of the APOBEC3 paralog(s) that may act as prime drivers of mutagenesis and the mechanisms underlying different APOBEC3-associated mutational signatures are unknown. To directly investigate the roles of APOBEC3 enzymes in cancer mutagenesis, candidate APOBEC3 genes were deleted from cancer cell lines recently found to naturally generate APOBEC3-associated mutations in episodic bursts [4]. Deletion of the APOBEC3A paralog severely diminished the acquisition of mutations of speculative APOBEC3 origins in breast cancer and lymphoma cell lines. APOBEC3 mutational burdens were undiminished in APOBEC3B knockout cell lines. APOBEC3A deletion reduced the appearance of the clustered mutation types kataegis and omikli, which are frequently found in cancer genomes. The uracil glycosylase UNG and the translesion polymerase REV1 were found to play critical roles in the generation of mutations induced by APOBEC3A. These data represent the first evidence for a long-postulated hypothesis that APOBEC3 deaminases generate prevalent clustered and non-clustered mutational signatures in human cancer cells, identify APOBEC3A as a driver of episodic mutational bursts, and dissect the roles of the relevant enzymes in generating the associated mutations in breast cancer and B cell lymphoma cell lines.


2021 ◽  
Author(s):  
John Maciejowski ◽  
Mia Petljak ◽  
Kevan Chu ◽  
Alexandra Dananberg ◽  
Erik Bergstrom ◽  
...  

Abstract The APOBEC3 family of cytidine deaminases is widely speculated to be a major source of somatic mutations in cancer [1–3]. However, causal links between APOBEC3 enzymes and mutations in human cancer cells have not been established. The identity of the APOBEC3 paralog(s) that may act as prime drivers of mutagenesis and the mechanisms underlying different APOBEC3-associated mutational signatures are unknown. To directly investigate the roles of APOBEC3 enzymes in cancer mutagenesis, candidate APOBEC3 genes were deleted from cancer cell lines recently found to naturally generate APOBEC3-associated mutations in episodic bursts [4]. Deletion of the APOBEC3A paralog severely diminished the acquisition of mutations of speculative APOBEC3 origins in breast cancer and lymphoma cell lines. APOBEC3 mutational burdens were undiminished in APOBEC3B knockout cell lines. APOBEC3A deletion reduced the appearance of the clustered mutation types kataegis and omikli, which are frequently found in cancer genomes. The uracil glycosylase UNG and the translesion polymerase REV1 were found to play critical roles in the generation of mutations induced by APOBEC3A. These data represent the first evidence for a long-postulated hypothesis that APOBEC3 deaminases generate prevalent clustered and non-clustered mutational signatures in human cancer cells, identify APOBEC3A as a driver of episodic mutational bursts, and dissect the roles of the relevant enzymes in generating the associated mutations in breast cancer and B cell lymphoma cell lines.


2020 ◽  
Author(s):  
Julián Candia

Abstract Summary: mutSigMapper aims to resolve a critical shortcoming of existing software for mutational signature analysis, namely that of finding parsimonious and biologically plausible exposures. By implementing a shot-noise-based model to generate spectral ensembles, this package addresses this gap and provides a quantitative, non-parametric assessment of statistical significance for the association between mutational signatures and observed spectra. Availability and implementation: The mutSigMapper R package is available under GPLv3 license at https://github.com/juliancandia/mutSigMapper. Its documentation provides additional details and demonstrates applications to biological datasets.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ege Ülgen ◽  
O. Uğur Sezerman

Abstract Background: Cancer develops due to “driver” alterations. Numerous approaches exist for predicting cancer drivers from cohort-scale genomics data. However, methods for personalized analysis of driver genes are underdeveloped. In this study, we developed a novel personalized/batch analysis approach for driver gene prioritization utilizing somatic genomics data, called driveR. Results: Combining genomics information and prior biological knowledge, driveR accurately prioritizes cancer driver genes via a multi-task learning model. Testing on 28 different datasets, this study demonstrates that driveR performs adequately, achieving a median AUC of 0.684 (range 0.651–0.861) on the 28 batch analysis test datasets, and a median AUC of 0.773 (range 0–1) on the 5157 personalized analysis test samples. Moreover, it outperforms existing approaches, achieving a significantly higher median AUC than all of MutSigCV (Wilcoxon rank-sum test p < 0.001), DriverNet (p < 0.001), OncodriveFML (p < 0.001) and MutPanning (p < 0.001) on batch analysis test datasets, and a significantly higher median AUC than DawnRank (p < 0.001) and PRODIGY (p < 0.001) on personalized analysis datasets. Conclusions: This study demonstrates that the proposed method is an accurate and easy-to-utilize approach for prioritizing driver genes in cancer genomes in personalized or batch analyses. driveR is available on CRAN: https://cran.r-project.org/package=driveR.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gwenaëlle G. Lemoine ◽  
Marie-Pier Scott-Boyer ◽  
Bathilde Ambroise ◽  
Olivier Périn ◽  
Arnaud Droit

Abstract Background: Network-based analysis of gene expression through co-expression networks can be used to investigate modular relationships occurring between genes performing different biological functions. An extended description of each of the network modules is therefore a critical step to understand the underlying processes contributing to a disease or a phenotype. Biological integration, topology study and condition comparison (e.g. wild-type vs mutant) are the main methods to do so, but to date no tool combines them all into a single pipeline. Results: Here we present GWENA, a new R package that integrates gene co-expression network construction and full characterization of the detected modules through gene set enrichment, phenotypic association, hub gene detection, topological metric computation, and differential co-expression. To demonstrate its performance, we applied GWENA to two skeletal muscle datasets from young and old patients of the GTEx study. Remarkably, we prioritized a gene whose involvement in muscle development and growth was unknown. Moreover, new insights into the variations in patterns of co-expression were identified. The known phenomenon of connectivity loss associated with aging was found to be coupled to a global reorganization of the relationships, leading to expression of known aging-related functions. Conclusion: GWENA is an R package available through Bioconductor (https://bioconductor.org/packages/release/bioc/html/GWENA.html) that has been developed to perform extended analysis of gene co-expression networks. Thanks to biological and topological information as well as differential co-expression, the package helps to dissect the role of gene relationships in disease conditions or targeted phenotypes. GWENA goes beyond existing packages that perform co-expression analysis by including new tools to fully characterize modules, such as differential co-expression, additional enrichment databases, and network visualization.


2018 ◽  
Author(s):  
Avantika Lal ◽  
Keli Liu ◽  
Robert Tibshirani ◽  
Arend Sidow ◽  
Daniele Ramazzotti

Abstract Cancer is the result of mutagenic processes that can be inferred from tumor genomes by analyzing rate spectra of point mutations, or “mutational signatures”. Here we present SparseSignatures, a novel framework to extract signatures from somatic point mutation data. Our approach incorporates DNA replication error as a background, employs regularization to reduce noise in non-background signatures, uses cross-validation to identify the number of signatures, and is scalable to large datasets. We show that SparseSignatures outperforms current state-of-the-art methods on simulated data using standard metrics. We then apply SparseSignatures to whole genome sequences of 147 tumors from pancreatic cancer, discovering 8 signatures in addition to the background.

