base quality score
Recently Published Documents


Total documents: 7 (five years: 3)
H-index: 1 (five years: 0)

PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0244471
Author(s):  
Charlotte Herzeel ◽  
Pascal Costanza ◽  
Dries Decap ◽  
Jan Fostier ◽  
Roel Wuyts ◽  
...  

We present elPrep 5, which updates the elPrep framework for processing sequencing alignment/map files with variant calling. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces the same BAM and VCF output as GATK4 while significantly reducing the runtime by parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 speeds up the variant calling pipeline by a factor of 8 to 16 on both whole-exome and whole-genome data while using the same hardware resources as GATK4. This makes elPrep 5 a suitable drop-in replacement for GATK4 when faster execution times are needed.


2020 ◽  
Author(s):  
Olga Krasheninina ◽  
Yih-Chii Hwang ◽  
Xiaodong Bai ◽  
Aleksandra Zalcman ◽  
Evan Maxwell ◽  
...  

Standardized genome informatics protocols minimize reprocessing costs and facilitate harmonization across studies if implemented in a transparent, accessible and reproducible manner. Here we define the OQFE protocol, a lossless read-mapping protocol that retains key features of existing NGS standard methods. We demonstrate that variants can be called directly from NovaSeq OQFE data without the need for base quality score recalibration and describe a large-scale variant calling protocol for OQFE data. The OQFE protocol is open-source and a containerized implementation is provided.


2020 ◽  
Author(s):  
Charlotte Herzeel ◽  
Pascal Costanza ◽  
Dries Decap ◽  
Jan Fostier ◽  
Roel Wuyts ◽  
...  

We present elPrep 5, which updates the elPrep framework for processing sequencing alignment/map files with variant calling. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces the same BAM and VCF output as GATK 4 while significantly reducing the runtime by parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 speeds up the variant calling pipeline by a factor of 8 to 16 on both whole-exome and whole-genome data while using the same hardware resources as GATK 4. This makes elPrep 5 a suitable drop-in replacement for GATK 4 when faster execution times are needed.


2018 ◽  
Author(s):  
Charlotte Herzeel ◽  
Pascal Costanza ◽  
Dries Decap ◽  
Jan Fostier ◽  
Wilfried Verachtert

We present elPrep 4, a reimplementation from scratch of the elPrep framework for processing sequence alignment map files in the Go programming language. elPrep 4 includes multiple new features allowing us to process all of the preparation steps defined by the GATK Best Practice pipelines for variant calling. This includes new and improved functionality for sorting, (optical) duplicate marking, base quality score recalibration, BED and VCF parsing, and various filtering options. The implementations of these options in elPrep 4 faithfully reproduce the outcomes of their counterparts in GATK 4, SAMtools, and Picard, even though the underlying algorithms are redesigned to take advantage of elPrep's parallel execution framework to vastly improve the runtime and resource use compared to these tools. Our benchmarks show that elPrep executes the preparation steps of the GATK Best Practices up to 13x faster on WES data, and up to 7.4x faster for WGS data compared to running the same pipeline with GATK 4, while utilizing fewer compute resources.


2017 ◽  
Author(s):  
Jade C.S. Chung ◽  
Swaine L. Chen

Next-generation sequencing data is accompanied by quality scores that quantify sequencing error. Inaccuracies in these quality scores propagate through all subsequent analyses; thus base quality score recalibration is a standard step in many next-generation sequencing workflows, resulting in improved variant calls. Current base quality score recalibration algorithms rely on the assumption that sequencing errors are already known; for human resequencing data, relatively complete variant databases facilitate this. However, because existing databases are still incomplete, recalibration is still inaccurate; and most organisms do not have variant databases, exacerbating inaccuracy for non-human data. To overcome these logical and practical problems, we introduce Lacer, which recalibrates base quality scores without assuming knowledge of correct and incorrect bases and without requiring knowledge of common variants. Lacer is the first logically sound, fully general, and truly accurate base recalibrator. Lacer enhances variant identification accuracy for resequencing data of human as well as other organisms (which are not accessible to current recalibrators), simultaneously improving and extending the benefits of base quality score recalibration to nearly all ongoing sequencing projects. Lacer is available at: https://github.com/swainechen/lacer.
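
As background for the quality scores mentioned in this abstract: base qualities are conventionally reported on the Phred scale, where Q = -10 log10(p) and p is the estimated probability that the base call is wrong. The Python sketch below shows that conversion, plus a deliberately simplified "empirical quality" computed from mismatch counts. The function names and the add-one smoothing are illustrative assumptions; this is not the algorithm used by Lacer, GATK, or elPrep.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    return -10 * math.log10(p)


def empirical_quality(mismatches: int, observations: int) -> float:
    """Phred score implied by observed mismatches among base calls.

    Hypothetical helper: add-one smoothing keeps the estimate finite
    when no mismatches are observed. Real recalibrators use more
    elaborate models and covariates.
    """
    p = (mismatches + 1) / (observations + 2)
    return error_prob_to_phred(p)


if __name__ == "__main__":
    print(phred_to_error_prob(30))     # error probability for Q30
    print(error_prob_to_phred(0.01))   # Phred score for a 1% error rate
    print(empirical_quality(9, 998))   # empirical score for one bin of bases
```

Comparing the reported score of a group of bases with such an empirical score is the basic idea behind recalibration: when the two disagree, the reported qualities are miscalibrated for that group.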


2015 ◽  
Vol 16 (Suppl 5) ◽  
pp. S8 ◽  
Author(s):  
Xiaoqing Peng ◽  
Jianxin Wang ◽  
Zhen Zhang ◽  
Qianghua Xiao ◽  
Min Li ◽  
...  
