QC-Chain: Fast and Holistic Quality Control Method for Next-Generation Sequencing Data

PLoS ONE ◽  
2013 ◽  
Vol 8 (4) ◽  
pp. e60234 ◽  
Author(s):  
Qian Zhou ◽  
Xiaoquan Su ◽  
Anhui Wang ◽  
Jian Xu ◽  
Kang Ning
2014 ◽  
Vol 5 ◽  
Author(s):  
Urmi H. Trivedi ◽  
Timothée Cézard ◽  
Stephen Bridgett ◽  
Anna Montazam ◽  
Jenna Nichols ◽  
...  

2018 ◽  
Vol 3 ◽  
pp. 36 ◽  
Author(s):  
Márton Münz ◽  
Shazia Mahamdallie ◽  
Shawn Yost ◽  
Andrew Rimmer ◽  
Emma Poyastro-Pearson ◽  
...  

Quality assurance and quality control are essential for robust next-generation sequencing (NGS). Here we present CoverView, a fast, flexible, user-friendly quality evaluation tool for NGS data. CoverView processes mapped sequencing reads and user-specified regions to report depth of coverage, base quality and mapping quality metrics at increasing levels of detail, from a chromosome-level summary to per-base profiles. CoverView can flag regions that do not fulfil user-specified quality requirements, allowing suboptimal data to be systematically and automatically presented for review. It also provides an interactive graphical user interface (GUI) that can be opened in a web browser and allows intuitive exploration of results. We have integrated CoverView into our accredited clinical cancer predisposition gene testing laboratory, which uses the TruSight Cancer Panel (TSCP). CoverView has been invaluable for optimisation and quality control of our testing pipeline, providing transparent, consistent quality metric information and automatic flagging of regions that fall below quality thresholds. We demonstrate this utility with TSCP data from the Genome in a Bottle reference sample, which CoverView analysed in 13 seconds. CoverView uses data routinely generated by NGS pipelines, reads standard input formats, and rapidly creates easy-to-parse output text (.txt) files that are customised by a simple configuration file. CoverView can therefore be easily integrated into any NGS pipeline. CoverView and detailed documentation for its use are freely available at github.com/RahmanTeamDevelopment/CoverView/releases and www.icr.ac.uk/CoverView.
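
The threshold-based flagging described in this abstract can be illustrated with a minimal sketch. This is not CoverView's code, configuration format, or output: the names `RegionSummary` and `summarise_region`, the region labels, and the minimum-depth threshold of 20 are all hypothetical, and the per-base depths are made-up illustrative values.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RegionSummary:
    """Per-region depth summary (hypothetical structure, not CoverView output)."""
    name: str
    min_depth: int
    mean_depth: float
    flagged: bool


def summarise_region(name, depths, min_depth_required):
    """Summarise per-base depths and flag the region if any base is below the threshold."""
    if not depths:
        raise ValueError("region has no per-base depths")
    lowest = min(depths)
    return RegionSummary(name, lowest, mean(depths), lowest < min_depth_required)


# Made-up per-base depths for two hypothetical regions.
regions = {
    "GENE_A_exon1": [52, 60, 58, 49],
    "GENE_B_exon2": [35, 12, 40, 41],
}

# Assumed requirement for illustration: every base covered at least 20 times.
summaries = [summarise_region(n, d, min_depth_required=20) for n, d in regions.items()]
flagged = [s.name for s in summaries if s.flagged]
print(flagged)
```

Flagging on the minimum rather than the mean is a deliberate choice in this sketch: a single poorly covered base can matter for variant calling even when the regional average looks acceptable.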


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Eric M. Davis ◽  
Yu Sun ◽  
Yanling Liu ◽  
Pandurang Kolekar ◽  
Ying Shao ◽  
...  

Abstract. Background: There is currently no method to precisely measure the errors that occur in the sequencing instrument/sequencer, which is critical for next-generation sequencing applications aimed at discovering the genetic makeup of heterogeneous cellular populations. Results: We propose a novel computational method, SequencErr, to address this challenge by measuring the base correspondence between overlapping regions in forward and reverse reads. An analysis of 3777 public datasets from 75 research institutions in 18 countries revealed the sequencer error rate to be ~10 per million (pm) and 1.4% of sequencers and 2.7% of flow cells have error rates >100 pm. At the flow cell level, error rates are elevated in the bottom surfaces and >90% of HiSeq and NovaSeq flow cells have at least one outlier error-prone tile. By sequencing a common DNA library on different sequencers, we demonstrate that sequencers with high error rates have reduced overall sequencing accuracy, and removal of outlier error-prone tiles improves sequencing accuracy. We demonstrate that SequencErr can reveal novel insights relative to the popular quality control method FastQC and achieve a 10-fold lower error rate than popular error correction methods including Lighter and Musket. Conclusions: Our study reveals novel insights into the nature of DNA sequencing errors incurred on DNA sequencers. Our method can be used to assess, calibrate, and monitor sequencer accuracy, and to computationally suppress sequencer errors in existing datasets.
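
The idea of comparing bases where forward and reverse reads overlap can be sketched as follows. This is an illustrative toy, not the SequencErr implementation: it assumes the overlap length of each read pair is already known, ignores alignment, indels, and base-quality filtering, and all function names are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_mismatches(forward, reverse, overlap):
    """Return (mismatches, compared) over the overlapping bases of one read pair.

    Assumes the last `overlap` bases of the forward read cover the same
    fragment positions as the first `overlap` bases of the reverse-complemented
    reverse read. Positions where either base is N are skipped.
    """
    if overlap <= 0 or overlap > min(len(forward), len(reverse)):
        raise ValueError("overlap must be between 1 and the shorter read length")
    fwd_part = forward[-overlap:]
    rev_part = reverse_complement(reverse)[:overlap]
    mismatches = compared = 0
    for a, b in zip(fwd_part, rev_part):
        if a == "N" or b == "N":
            continue
        compared += 1
        mismatches += int(a != b)
    return mismatches, compared


def errors_per_million(pairs):
    """Pool (forward, reverse, overlap) pairs and return mismatches per million compared bases."""
    total_mm = total_cmp = 0
    for forward, reverse, overlap in pairs:
        mm, cmp_count = overlap_mismatches(forward, reverse, overlap)
        total_mm += mm
        total_cmp += cmp_count
    return 1e6 * total_mm / total_cmp if total_cmp else 0.0


# Toy 10-base fragment read from both ends with 7-base reads, so 4 bases overlap.
fragment = "ACGTACGTAC"
fwd = fragment[:7]
rev = reverse_complement(fragment[3:])
print(overlap_mismatches(fwd, rev, 4))
```

Because both reads in a pair derive from the same library molecule, a disagreement in the overlap is more likely to reflect an error introduced during sequencing than one introduced upstream, which is the intuition the abstract relies on.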


2019 ◽  
Vol 15 (12) ◽  
pp. e1007556 ◽  
Author(s):  
Jiajin Li ◽  
Brandon Jew ◽  
Lingyu Zhan ◽  
Sungoo Hwang ◽  
Giovanni Coppola ◽  
...  
