Detection of structural mosaicism from targeted and whole-genome sequencing data

2016 ◽  
Author(s):  
Daniel A. King ◽  
Alejandro Sifrim ◽  
Tomas W. Fitzgerald ◽  
Raheleh Rahbari ◽  
Emma Hobson ◽  
...  

ABSTRACT Structural mosaic abnormalities are large post-zygotic mutations present in a subset of cells and have been implicated in developmental disorders and cancer. Such mutations have been conventionally assessed in clinical diagnostics using cytogenetic or microarray testing. Modern disease studies rely heavily on exome sequencing, yet an adequate method for the detection of structural mosaicism using targeted sequencing data is lacking. Here, we present a method, called MrMosaic, to detect structural mosaic abnormalities using deviations in allele fraction and read coverage from next-generation sequencing data. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) simulations were used to calculate detection performance across a range of mosaic event sizes, types, clonalities, and sequencing depths. The tool was applied to 4,911 patients with undiagnosed developmental disorders, and 11 events in 9 patients were detected. In 8 of 11 cases, mosaicism was observed in saliva but not blood, suggesting that assaying blood alone would miss a large fraction, possibly more than 50%, of mosaic diagnostic chromosomal rearrangements.
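The two signals named in the abstract, deviation in allele fraction and deviation in read coverage, can be illustrated with a minimal sketch. This is not MrMosaic's actual algorithm or API: the function name `mosaic_signals`, its parameters, and the use of 0.5 as the expected allele fraction at a heterozygous site are assumptions made purely for illustration.

```python
import math


def mosaic_signals(ref_reads, alt_reads, sample_depth, expected_depth):
    """Return (allele-fraction deviation, log2 coverage ratio) for one site.

    Hypothetical helper, not part of MrMosaic. Assumes the site is a
    germline heterozygous SNP, so the expected allele fraction is 0.5.
    """
    total = ref_reads + alt_reads
    if total == 0 or sample_depth <= 0 or expected_depth <= 0:
        raise ValueError("read counts and depths must be positive")
    # Fraction of reads supporting the alternate allele at this site.
    alt_fraction = alt_reads / total
    # Distance from the 0.5 expected when both alleles are equally present.
    af_deviation = abs(alt_fraction - 0.5)
    # Coverage relative to a reference depth; 0 means no copy-number change.
    log2_ratio = math.log2(sample_depth / expected_depth)
    return af_deviation, log2_ratio


# Skewed allele fraction with unchanged coverage.
print(mosaic_signals(30, 70, 100, 100))
# Balanced allele fraction with halved coverage.
print(mosaic_signals(50, 50, 50, 100))
```

Considering both values together is what separates event types: a copy-neutral loss of heterozygosity shifts the allele fraction without changing coverage, whereas a mosaic deletion or duplication shifts both.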

2017 ◽  
Vol 114 (38) ◽  
pp. 10166-10171 ◽  
Author(s):  
Christoph Lippert ◽  
Riccardo Sabatini ◽  
M. Cyrus Maher ◽  
Eun Yong Kang ◽  
Seunghak Lee ◽  
...  

Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited. However, we have developed a maximum entropy algorithm that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person. Using this algorithm, we have reidentified an average of >8 of 10 held-out individuals in an ethnically mixed cohort and an average of 5 of either 10 African Americans or 10 Europeans. This work challenges current conceptions of personal privacy and may have far-reaching ethical and legal implications.


PLoS ONE ◽  
2016 ◽  
Vol 11 (6) ◽  
pp. e0157718 ◽  
Author(s):  
Martin Christen Frølund Thomsen ◽  
Johanne Ahrenfeldt ◽  
Jose Luis Bellod Cisneros ◽  
Vanessa Jurtz ◽  
Mette Voldby Larsen ◽  
...  

2020 ◽  
Author(s):  
Sihao Xiao ◽  
Zhentian Kai ◽  
David Brown ◽  
Claire L Shovlin ◽  

SUMMARY Whole genome sequencing (WGS) is championed by the UK National Health Service (NHS) to identify genetic variants that cause particular diseases. The full potential of WGS has yet to be realised, as early data analytic steps prioritise protein-coding genes and effectively ignore the less well annotated non-coding genome, which is rich in transcribed and critical regulatory regions. To address this, we developed a filter, which we call GROFFFY, and validated it in WGS data from hereditary haemorrhagic telangiectasia patients within the 100,000 Genomes Project. Before filter application, the mean number of DNA variants compared to human reference sequence GRCh38 was 4,867,167 (range 4,786,039-5,070,340), and one-third lay within intergenic areas. GROFFFY removed a mean of 2,812,015 variants per DNA. In combination with allele frequency and other filters, GROFFFY enabled a 99.56% reduction in variant number. The proportion of intergenic variants was maintained, and no pathogenic variants in disease genes were lost. We conclude that the filter applied to NHS diagnostic samples in the 100,000 Genomes pipeline offers an efficient method to prioritise intergenic, intronic and coding gDNA variants. Reducing the overwhelming number of variants while retaining functional genome variation of importance to patients enhances the near-term value of WGS in clinical diagnostics.
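The figures in this summary can be put on a common scale with a short calculation. The sketch below uses only the reported means; applying the 99.56% reduction to the mean variant count is an approximation assumed here for illustration, since the summary does not say whether that percentage was computed per sample or on pooled counts. The helper name `remaining_after_reduction` is hypothetical.

```python
def remaining_after_reduction(total, percent_reduction):
    """Variants left after removing `percent_reduction` percent of `total`."""
    return total * (1 - percent_reduction / 100)


# Means reported in the summary.
MEAN_VARIANTS = 4_867_167
MEAN_REMOVED_BY_GROFFFY = 2_812_015

# Variants left after GROFFFY alone, and the share it removed.
after_groffffy = MEAN_VARIANTS - MEAN_REMOVED_BY_GROFFFY
fraction_removed = MEAN_REMOVED_BY_GROFFFY / MEAN_VARIANTS

# Approximate variants left after GROFFFY plus the other filters
# (assumption: the 99.56% figure is applied to the mean count).
after_all_filters = remaining_after_reduction(MEAN_VARIANTS, 99.56)

print(f"after GROFFFY alone: {after_groffffy:,} ({fraction_removed:.2%} removed)")
print(f"after all filters:   about {after_all_filters:,.0f}")
```

On these numbers GROFFFY alone removes roughly 58% of variants, leaving about 2.06 million per sample, and the combined filters leave on the order of 21,000.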


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Andreas Ruscheinski ◽  
Anna Lena Reimler ◽  
Roland Ewald ◽  
Adelinde M. Uhrmacher

Abstract. Background: Clinical diagnostics of whole-exome and whole-genome sequencing data requires geneticists to consider thousands of genetic variants for each patient. Various variant prioritization methods have been developed in recent years to aid clinicians in identifying variants that are likely disease-causing. Each time a new method is developed, its effectiveness must be evaluated and compared to other approaches based on the most recently available evaluation data. Doing so in an unbiased, systematic, and replicable manner requires significant effort. Results: The open-source test bench “VPMBench” automates the evaluation of variant prioritization methods. VPMBench introduces a standardized interface for prioritization methods and provides a plugin system that makes it easy to evaluate new methods. It supports different input data formats and custom output data preparation. VPMBench exploits declaratively specified information about the methods, e.g., the variants supported by the methods. Plugins may also be provided in a technology-agnostic manner via containerization. Conclusions: VPMBench significantly simplifies the evaluation of both custom and published variant prioritization methods. As we expect variant prioritization methods to become ever more critical with the advent of whole-genome sequencing in clinical diagnostics, such tool support is crucial to facilitate methodological research.


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242168
Author(s):  
Julien Paganini ◽  
Peter L. Nagy ◽  
Nicholas Rouse ◽  
Philippe Gouret ◽  
Jacques Chiaroni ◽  
...  

Many questions can be explored thanks to whole-genome data. The aim of this study was to overcome the main limitations of such data, software availability and database accuracy, and to estimate the feasibility of red blood cell (RBC) antigen typing from whole-genome sequencing (WGS) data. We analyzed whole-genome data from 79 individuals for HLA-DRB1 and 9 RBC antigens. Whole-genome sequencing data were analyzed with software that allows phasing of variable positions to define alleles or haplotypes and that has been validated for HLA typing from next-generation sequencing data. A dedicated database was set up with 1648 variable positions analyzed in KEL (KEL), ACKR1 (FY), SLC14A1 (JK), ACHE (YT), ART4 (DO), AQP1 (CO), CD44 (IN), SLC4A1 (DI) and ICAM4 (LW). Whole-genome sequencing typing was compared to that previously obtained by amplicon-based monoallelic sequencing and by SNaPshot analysis. Whole-genome sequencing data were also explored for other alleles. Our results showed 93% concordance for blood group polymorphisms and 91% for HLA-DRB1. Incorrect typing and unresolved results confirm that WGS should be considered reliable only at read depths strictly above 15x. Our results indicate that RBC antigen typing from WGS is feasible but requires improvements in read depth to type SNV polymorphisms accurately. We also showed the potential for WGS in screening donors with rare blood antigens, such as weak JK alleles. The development of WGS analysis in immunogenetics laboratories would offer personalized care in the management of RBC disorders.


Heredity ◽  
2021 ◽  
Author(s):  
Axel Jensen ◽  
Mette Lillie ◽  
Kristofer Bergström ◽  
Per Larsson ◽  
Jacob Höglund

Abstract The use of genetic markers in the context of conservation is largely being outcompeted by whole-genome data. Comparative studies between the two are sparse, and the knowledge about potential effects of this methodology shift is limited. Here, we used whole-genome sequencing data to assess the genetic status of peripheral populations of the wels catfish (Silurus glanis), and discuss the results in light of a recent microsatellite study of the same populations. The Swedish populations of the wels catfish have suffered from severe declines during the last centuries and persist in only a few isolated water systems. Fragmented populations generally are at greater risk of extinction, for example due to loss of genetic diversity, and may thus require conservation actions. We sequenced individuals from the three remaining native populations (Båven, Emån, and Möckeln) and one reintroduced population of admixed origin (Helge å), and found that genetic diversity was highest in Emån but low overall, with strong differentiation among the populations. No signature of recent inbreeding was found, but a considerable number of short runs of homozygosity were present in all populations, likely linked to historically small population sizes and bottleneck events. Genetic substructure within any of the native populations was at best weak. Individuals from the admixed population Helge å shared most genetic ancestry with the Båven population (72%). Our results are largely in agreement with the microsatellite study, and stress the need to protect these isolated populations at the northern edge of the distribution of the species.

