A comprehensive benchmarking of WGS-based structural variant callers

Author(s):  
Varuni Sarwal ◽  
Sebastian Niehus ◽  
Ram Ayyala ◽  
Sei Chang ◽  
Angela Lu ◽  
...  

Abstract Advances in whole genome sequencing promise to enable accurate and comprehensive structural variant (SV) discovery. Dissecting SVs from whole genome sequencing (WGS) data presents a substantial number of challenges, and a plethora of SV-detection methods have been developed. Currently, there is a paucity of evidence that investigators can use to select appropriate SV-detection tools. In this paper, we evaluated the performance of SV-detection tools using a comprehensive PCR-confirmed gold standard set of SVs. In contrast to previous benchmarking studies, our gold standard dataset included a complete set of SVs, allowing us to report both the precision and the sensitivity of SV-detection methods. Our study investigates the ability of the methods to detect deletions, thus providing an optimistic estimate of SV detection performance, as SV-detection methods that fail to detect deletions are likely to miss more complex SVs. We found that SV-detection tools varied widely in their performance, with several methods providing a good balance between sensitivity and precision. Additionally, we determined the SV callers best suited for low- and ultra-low-pass sequencing data.
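As an illustration of how precision and sensitivity can be computed for deletion calls against a gold standard set, here is a minimal Python sketch. It is not the benchmarking code used in the study: the `Deletion` type, the function names, and the 100 bp breakpoint tolerance are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deletion:
    chrom: str
    start: int
    end: int


def matches(call: Deletion, truth: Deletion, tolerance: int) -> bool:
    """A call matches a true deletion if both breakpoints fall within
    `tolerance` bp (hypothetical matching rule, not the paper's)."""
    return (
        call.chrom == truth.chrom
        and abs(call.start - truth.start) <= tolerance
        and abs(call.end - truth.end) <= tolerance
    )


def precision_sensitivity(calls, gold, tolerance=100):
    """Return (precision, sensitivity) for a list of deletion calls.

    Each gold standard deletion can be matched by at most one call,
    so duplicate calls of the same event count as false positives.
    """
    matched_truth = set()
    true_positives = 0
    for call in calls:
        for i, truth in enumerate(gold):
            if i not in matched_truth and matches(call, truth, tolerance):
                matched_truth.add(i)
                true_positives += 1
                break
    precision = true_positives / len(calls) if calls else 0.0
    sensitivity = true_positives / len(gold) if gold else 0.0
    return precision, sensitivity


# Toy data: three true deletions, two calls (one correct, one spurious).
gold = [
    Deletion("chr1", 1000, 2000),
    Deletion("chr1", 5000, 6000),
    Deletion("chr2", 100, 900),
]
calls = [
    Deletion("chr1", 1010, 1990),
    Deletion("chr1", 8000, 9000),
]
precision, sensitivity = precision_sensitivity(calls, gold)
```

Precision is the fraction of calls that are true deletions, and sensitivity is the fraction of true deletions that were called. Reporting both requires a complete gold standard, since otherwise a correct call absent from the truth set would be miscounted as a false positive.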

Author(s):  
Yongzhuang Liu ◽  
Yalin Huang ◽  
Guohua Wang ◽  
Yadong Wang

Abstract Short read whole genome sequencing has become widely used to detect structural variants in human genetic studies and clinical practice. However, accurate detection of structural variants is a challenging task. In particular, existing structural variant detection approaches produce a large proportion of incorrect calls, so effective structural variant filtering approaches are urgently needed. In this study, we propose a novel deep learning-based approach, DeepSVFilter, for filtering structural variants in short read whole genome sequencing data. DeepSVFilter encodes structural variant signals in the read alignments as images and adopts transfer learning with pre-trained convolutional neural networks as the classification models, which are trained on well-characterized samples with known high confidence structural variants. We use two well-characterized samples to demonstrate DeepSVFilter’s performance and its filtering effect when coupled with commonly used structural variant detection approaches. DeepSVFilter is implemented in Python and is freely available at https://github.com/yongzhuang/DeepSVFilter.


2019 ◽  
Author(s):  
Clare Puttick ◽  
Kishore R Kumar ◽  
Ryan L Davis ◽  
Mark Pinese ◽  
David M Thomas ◽  
...  

Abstract Motivation: Mitochondrial diseases (MDs) are the most common group of inherited metabolic disorders and are often challenging to diagnose due to extensive genotype-phenotype heterogeneity. MDs are caused by mutations in the nuclear or mitochondrial genome, where pathogenic mitochondrial variants are usually heteroplasmic and typically at a much lower allelic fraction in the blood than in affected tissues. Both genomes can now be readily analysed using unbiased whole genome sequencing (WGS), but most nuclear variant detection methods fail to detect low heteroplasmy variants in the mitochondrial genome. Results: We present mity, a bioinformatics pipeline for detecting and interpreting heteroplasmic SNVs and INDELs in the mitochondrial genome using WGS data. In 2,980 healthy controls, we observed on average 3,166× coverage in the mitochondrial genome using WGS from blood. mity utilises this high depth to detect pathogenic mitochondrial variants, even at low heteroplasmy. mity enables easy interpretation of mitochondrial variants and can be incorporated into existing diagnostic WGS pipelines. This could simplify the diagnostic pathway, avoid invasive tissue biopsies and increase the diagnostic rate for MDs and other conditions caused by impaired mitochondrial function. Availability: mity is available from https://github.com/KCCG/mity under an MIT license.
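To see why high mitochondrial coverage matters for low heteroplasmy variants, consider the following minimal Python sketch. It estimates heteroplasmy as the fraction of reads supporting the alternate allele and applies a simple evidence filter. The function names and both thresholds (`min_fraction`, `min_alt_reads`) are hypothetical and are not taken from mity.

```python
def heteroplasmy(alt_reads: int, total_depth: int) -> float:
    """Estimate heteroplasmy as the alternate allele fraction at a site."""
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    return alt_reads / total_depth


def has_enough_evidence(
    alt_reads: int,
    total_depth: int,
    min_fraction: float = 0.01,   # hypothetical threshold, not mity's
    min_alt_reads: int = 10,      # hypothetical threshold, not mity's
) -> bool:
    """Require both a minimum allele fraction and a minimum number of
    supporting reads before treating a site as a candidate variant."""
    return (
        alt_reads >= min_alt_reads
        and heteroplasmy(alt_reads, total_depth) >= min_fraction
    )


# At roughly 3,166x depth, a 1% heteroplasmy variant is backed by ~32 reads.
deep = has_enough_evidence(alt_reads=32, total_depth=3166)

# At 30x depth, even a 10% variant is backed by only 3 reads.
shallow = has_enough_evidence(alt_reads=3, total_depth=30)
```

Under these assumed thresholds, the low heteroplasmy site at mitochondrial depth passes the filter, while the higher-fraction site at typical nuclear depth does not, because too few reads support it.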


2015 ◽  
Vol 27 (1) ◽  
pp. 14 ◽  
Author(s):  
A. Capitan ◽  
P. Michot ◽  
A. Baur ◽  
R. Saintilan ◽  
C. Hozé ◽  
...  

Fertility is a major concern in the dairy cattle industry and has been the subject of numerous studies over the past 20 years. Surprisingly, most of these studies focused on rough female phenotypes and, despite their important role in reproductive success, male- and embryo-related traits have been poorly investigated. In recent years, the rapid and important evolution of technologies in genetic research has led to the development of genomic selection. The generalisation of this method, in combination with the achievements of the artificial insemination (AI) industry, has led to the constitution of large databases of genotyping and sequencing data, as well as refined phenotypes and pedigree records. These resources offer unprecedented opportunities in terms of fundamental and applied research. Here we present five such examples with a focus on reproduction-related traits: (1) detection of quantitative trait loci (QTL) for male fertility and semen quality traits; (2) detection of QTL for refined phenotypes associated with female fertility; (3) identification of recessive embryonic lethal mutations by depletion of homozygous haplotypes; (4) identification of recessive embryonic lethal mutations by mining whole-genome sequencing data; and (5) the contribution of high-density single nucleotide polymorphism chips, whole-genome sequencing and imputation to increasing the power of QTL detection methods and to the identification of causal variants.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sung Yong Park ◽  
Gina Faraci ◽  
Pamela M. Ward ◽  
Jane F. Emerson ◽  
Ha Youn Lee

Abstract COVID-19 global cases have climbed to more than 33 million, with over a million total deaths, as of September 2020. Real-time massive SARS-CoV-2 whole genome sequencing is key to tracking chains of transmission and estimating the origin of disease outbreaks. Yet no methods have simultaneously achieved high precision, simple workflow, and low cost. We developed a high-precision, cost-efficient SARS-CoV-2 whole genome sequencing platform for COVID-19 genomic surveillance, CorvGenSurv (Coronavirus Genomic Surveillance). CorvGenSurv directly amplified viral RNA from COVID-19 patients’ nasopharyngeal/oropharyngeal (NP/OP) swab specimens and sequenced the SARS-CoV-2 whole genome in three segments by long-read, high-throughput sequencing. Sequencing of the whole genome in three segments significantly reduced sequencing data waste, thereby preventing dropouts in genome coverage. We validated the precision of our pipeline by both control genomic RNA sequencing and Sanger sequencing. We produced near full-length whole genome sequences from individuals who tested positive for COVID-19 between April and June 2020 in Los Angeles County, California, USA. These sequences were highly diverse in the G clade, with nine novel amino acid mutations including NSP12-M755I and ORF8-V117F. With its readily adaptable design, CorvGenSurv grants wide access to genomic surveillance, permitting immediate public health response to sudden threats.

