Enhancing the detection of barcoded reads in high throughput DNA sequencing data by controlling the false discovery rate

2014 ◽  
Vol 15 (1) ◽  
Author(s):  
Tilo Buschmann ◽  
Rong Zhang ◽  
Douglas E Brash ◽  
Leonid V Bystrykh

2011 ◽  
Vol 21 (5) ◽  
pp. 734-740 ◽  
Author(s):  
M. Hsi-Yang Fritz ◽  
R. Leinonen ◽  
G. Cochrane ◽  
E. Birney

2017 ◽  
Author(s):  
Yuchao Jiang ◽  
Rujin Wang ◽  
Eugene Urrutia ◽  
Ioannis N. Anastopoulos ◽  
Katherine L. Nathanson ◽  
...  

Abstract: High-throughput DNA sequencing enables detection of copy number variations (CNVs) on the genome-wide scale with finer resolution compared to array-based methods, but suffers from biases and artifacts that lead to false discoveries and low sensitivity. We describe CODEX2, a statistical framework for full-spectrum CNV profiling that is sensitive for variants with both common and rare population frequencies and that is applicable to study designs with and without negative control samples. We demonstrate and evaluate CODEX2 on whole-exome and targeted sequencing data, where biases are the most prominent. CODEX2 outperforms existing methods and, in particular, significantly improves sensitivity for common CNVs.


2018 ◽  
Author(s):  
Arda Soylev ◽  
Thong Le ◽  
Hajar Amini ◽  
Can Alkan ◽  
Fereydoun Hormozdiari

Abstract
Motivation: Several algorithms have been developed that use high throughput sequencing technology to characterize structural variations (SVs). Most of the existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions, and short inversions. In fact, complex SVs are of crucial importance and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. Additionally, due to similar sequencing signatures, inverted duplications or gene conversion events that include inverted segmental duplications are often characterized as simple inversions, and duplications and gene conversions in direct orientation may be called as simple deletions. Therefore, there is still a need for accurate algorithms to fully characterize complex SVs and thus improve calling accuracy of simpler variants.
Results: We developed novel algorithms to accurately characterize tandem, direct and inverted interspersed segmental duplications using short read whole genome sequencing data sets. We integrated these methods into our TARDIS tool, which is now capable of detecting various types of SVs using multiple sequence signatures such as read pair, read depth and split read. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real data sets. In the simulation experiments, at 30× coverage, TARDIS achieved 96% sensitivity with only a 4% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than state-of-the-art methods. Furthermore, we showed a surprisingly low false discovery rate for our predictions of tandem, direct and inverted interspersed segmental duplications on CHM1 (less than 5% for the top 50 predictions).
Availability: TARDIS source code is available at https://github.com/BilkentCompGen/tardis, and a corresponding Docker image is available at https://hub.docker.com/r/alkanlab/tardis/.
Contact: [email protected] and [email protected]
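For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how sensitivity and false discovery rate are conventionally computed from call counts. The function names and the counts are hypothetical, chosen only so the arithmetic is easy to follow; they are not taken from the TARDIS evaluation.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    total_real = true_pos + false_neg
    if total_real == 0:
        raise ValueError("sensitivity is undefined when there are no real variants")
    return true_pos / total_real


def false_discovery_rate(true_pos: int, false_pos: int) -> float:
    """Fraction of calls that are wrong: FP / (TP + FP)."""
    total_calls = true_pos + false_pos
    if total_calls == 0:
        raise ValueError("FDR is undefined when no calls were made")
    return false_pos / total_calls


# Hypothetical counts for illustration only (not data from the paper).
tp, fp, fn = 96, 4, 4
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"FDR         = {false_discovery_rate(tp, fp):.2%}")
```

Note that the two metrics use different denominators: sensitivity is measured against the set of real variants, while the false discovery rate is measured against the set of calls made.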


2013 ◽  
Vol 30 (4) ◽  
pp. 409-415
Author(s):  
Zexuan Zhu ◽  
Yongpeng Zhang ◽  
Zhuhong You ◽  
Liang Jiang ◽  
Zhen Ji

PLoS ONE ◽  
2016 ◽  
Vol 11 (5) ◽  
pp. e0155461 ◽  
Author(s):  
José M. Abuín ◽  
Juan C. Pichel ◽  
Tomás F. Pena ◽  
Jorge Amigo
