Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Matthew H. Bailey ◽  
William U. Meyerson ◽  
Lewis Jonathan Dursi ◽  
Liang-Bo Wang ◽  
...  

Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
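As a rough illustration of the kind of side-by-side comparison described above, the following minimal sketch compares two call sets and reports how many private calls fall below a VAF cutoff. It is not the authors' pipeline: the function name `compare_call_sets`, the variant key `(chromosome, position, ref, alt)`, the toy data, and the choice of shared calls divided by the union as the overlap measure are all assumptions made for illustration. Only the 15% VAF threshold comes from the abstract.

```python
from typing import Dict, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def compare_call_sets(
    wes: Dict[Variant, float],
    wgs: Dict[Variant, float],
    vaf_cutoff: float = 0.15,  # the abstract's low-VAF threshold (VAF < 15%)
) -> Dict[str, float]:
    """Compare two call sets that map each variant to its VAF."""
    shared = wes.keys() & wgs.keys()
    private_wes = wes.keys() - wgs.keys()
    private_wgs = wgs.keys() - wes.keys()
    union = wes.keys() | wgs.keys()

    def low_vaf_fraction(private, calls) -> float:
        # Fraction of private calls whose VAF is below the cutoff.
        if not private:
            return 0.0
        return sum(calls[v] < vaf_cutoff for v in private) / len(private)

    return {
        # Assumed overlap definition: shared calls over the union of both sets.
        "overlap_fraction": len(shared) / len(union) if union else 0.0,
        "private_wes": len(private_wes),
        "private_wgs": len(private_wgs),
        "low_vaf_private_wes": low_vaf_fraction(private_wes, wes),
        "low_vaf_private_wgs": low_vaf_fraction(private_wgs, wgs),
    }


# Toy data only; these variants and VAFs are invented for the example.
toy_wes = {
    ("chr1", 100, "A", "T"): 0.40,
    ("chr2", 200, "G", "C"): 0.30,
    ("chr3", 300, "C", "T"): 0.10,
}
toy_wgs = {
    ("chr1", 100, "A", "T"): 0.42,
    ("chr2", 200, "G", "C"): 0.28,
    ("chr4", 400, "T", "G"): 0.05,
    ("chr5", 500, "G", "A"): 0.50,
}
summary = compare_call_sets(toy_wes, toy_wgs)
```

A real comparison would also restrict both sets to regions covered by both assays, which this sketch leaves out.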

2015 ◽  
Vol 112 (4) ◽  
pp. 1107-1112 ◽  
Author(s):  
Kexin Chen ◽  
Da Yang ◽  
Xiangchun Li ◽  
Baocun Sun ◽  
Fengju Song ◽  
...  

Gastric cancer (GC) is a highly heterogeneous disease. To identify potential clinically actionable therapeutic targets that may inform individualized treatment strategies, we performed whole-exome sequencing on 78 GCs of differing histologies and anatomic locations, as well as whole-genome sequencing on two GC cases, each with three primary tumors and two matching lymph node metastases. The data showed two distinct GC subtypes with either high-clonality (HiC) or low-clonality (LoC). The HiC subtype of intratumoral heterogeneity was associated with older age, TP53 (tumor protein P53) mutation, enriched C > G transition, and significantly shorter survival, whereas the LoC subtype was associated with younger age, ARID1A (AT rich interactive domain 1A) mutation, and significantly longer survival. Phylogenetic tree analysis of whole-genome sequencing data from multiple samples of two patients supported the clonal evolution of GC metastasis and revealed the accumulation of genetic defects that necessitate combination therapeutics. The most recurrently mutated genes, which were validated in a separate cohort of 216 cases by targeted sequencing, were members of the homologous recombination DNA repair, Wnt, and PI3K-ERBB pathways. Notably, the druggable NRG1 (neuregulin-1) and ERBB4 (V-Erb-B2 avian erythroblastic leukemia viral oncogene homolog 4) ligand-receptor pair was mutated in 10% of GC cases. Mutations of the BRCA2 (breast cancer 2, early onset) gene, found in 8% of our cohort and validated in The Cancer Genome Atlas GC cohort, were associated with significantly longer survival. These data define distinct clinicogenetic forms of GC in the Chinese population that are characterized by specific mutation sets that can be investigated for efficacy of single and combination therapies.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Andreas Ruscheinski ◽  
Anna Lena Reimler ◽  
Roland Ewald ◽  
Adelinde M. Uhrmacher

Abstract: Background: Clinical diagnostics of whole-exome and whole-genome sequencing data requires geneticists to consider thousands of genetic variants for each patient. Various variant prioritization methods have been developed in recent years to aid clinicians in identifying variants that are likely disease-causing. Each time a new method is developed, its effectiveness must be evaluated and compared to other approaches based on the most recently available evaluation data. Doing so in an unbiased, systematic, and replicable manner requires significant effort. Results: The open-source test bench “VPMBench” automates the evaluation of variant prioritization methods. VPMBench introduces a standardized interface for prioritization methods and provides a plugin system that makes it easy to evaluate new methods. It supports different input data formats and custom output data preparation. VPMBench exploits declaratively specified information about the methods, e.g., the variants supported by the methods. Plugins may also be provided in a technology-agnostic manner via containerization. Conclusions: VPMBench significantly simplifies the evaluation of both custom and published variant prioritization methods. As we expect variant prioritization methods to become ever more critical with the advent of whole-genome sequencing in clinical diagnostics, such tool support is crucial to facilitate methodological research.


2013 ◽  
Vol 88 (1) ◽  
pp. 774-774 ◽  
Author(s):  
E. S. Amirian ◽  
M. L. Bondy ◽  
Q. Mo ◽  
M. N. Bainbridge ◽  
M. E. Scheurer

2018 ◽  
Author(s):  
Anna Supernat ◽  
Oskar Valdimar Vidarsson ◽  
Vidar M. Steen ◽  
Tomasz Stokowy

Abstract: Testing of patients with genetics-related disorders is shifting from single gene assays to gene panel sequencing, whole-exome sequencing (WES) and whole-genome sequencing (WGS). Since WGS is unquestionably becoming a new foundation for molecular analyses, we decided to compare three currently used tools for variant calling of human whole genome sequencing data. We tested DeepVariant, a new TensorFlow machine learning-based variant caller, and compared this tool to GATK 4.0 and SpeedSeq, using 30×, 15× and 10× WGS data of the well-known NA12878 DNA reference sample. According to our comparison, the performance on SNV calling was nearly identical in 30× data, with all three variant callers reaching F-Scores (i.e. harmonic mean of recall and precision) equal to 0.98. In contrast, DeepVariant was more precise in indel calling than GATK and SpeedSeq, as demonstrated by F-Scores of 0.94, 0.90 and 0.84, respectively. We conclude that the DeepVariant tool has great potential and usefulness for analysis of WGS data in medical genetics.
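The F-Score used above is the harmonic mean of precision and recall. As a small sketch of that definition, the function below computes it from true-positive, false-positive and false-negative counts. The function name `f_score` and the example counts are hypothetical and are not taken from the study's benchmarking tools or results.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from call counts."""
    called = true_pos + false_pos   # all variants the caller reported
    actual = true_pos + false_neg   # all variants in the truth set
    precision = true_pos / called if called else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        # No true positives at all, so the score is defined as zero here.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented counts for illustration: 90 correct calls, 10 false calls, 10 missed.
example = f_score(true_pos=90, false_pos=10, false_neg=10)
```

Because the harmonic mean is pulled toward the smaller of the two values, a caller cannot reach a high F-Score by being strong on precision alone or on recall alone.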


2017 ◽  
Author(s):  
Jeremiah A Wala ◽  
Ofer Shapira ◽  
Yilong Li ◽  
David Craft ◽  
Steven E Schumacher ◽  
...  

Abstract: Cancer cells can acquire profound alterations to the structure of their genomes, including rearrangements that fuse distant DNA breakpoints. We analyze the distribution of somatic rearrangements across the cancer genome, using whole-genome sequencing data from 2,693 tumor-normal pairs. We observe substantial variation in the density of rearrangement breakpoints, with enrichment in open chromatin and sites with high densities of repetitive elements. After accounting for these patterns, we identify significantly recurrent breakpoints (SRBs) at 52 loci, including novel SRBs near BRD4 and AKR1C3. Taking into account both loci fused by a rearrangement, we observe different signatures resembling either single breaks followed by strand invasion or two separate breaks that become joined. Accounting for these signatures, we identify 90 pairs of loci that are significantly recurrently juxtaposed (SRJs). SRJs are primarily tumor-type specific and tend to involve genes with tissue-specific expression. SRJs were frequently associated with disruption of topology-associated domains, juxtaposition of enhancer elements, and increased expression of neighboring genes. Lastly, we find that the power to detect SRJs decreases for short rearrangements, and that reliable detection of all driver SRJs will require whole-genome sequencing data from an order of magnitude more cancer samples than currently available.

