Performance Optimization of a Parallel Error Correction Tool

Continuous advances in Next Generation Sequencing (NGS) technologies allow researchers to sequence more genetic samples in less time, which makes it increasingly important to improve the algorithms that enhance the quality of the generated reads. In this work, we present a Big Data tool built on the open-source Apache Spark framework that executes validated error-correction algorithms with improved performance. An experimental evaluation on a multi-core cluster showed significant improvements in execution time, with a maximum speedup of 9.5 over existing error correction tools when processing an NGS dataset with 25 million reads.
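
As a reading aid for the speedup figure above, the following minimal sketch shows the usual definition of speedup as the ratio of baseline run time to optimized run time. The function name and both timings are hypothetical, chosen only so that the arithmetic yields 9.5; they are not measurements from the paper.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    if baseline_seconds <= 0 or optimized_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / optimized_seconds


# Hypothetical timings, chosen only to illustrate the arithmetic.
baseline = 1900.0   # seconds for an existing tool (made-up value)
optimized = 200.0   # seconds for the Spark-based tool (made-up value)
print(f"speedup = {speedup(baseline, optimized):.1f}x")
```

Under this definition, a speedup of 9.5 means the optimized run needs roughly 10.5% of the baseline run time.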

Assessment of the quality of DNA from various formalin-fixed paraffin-embedded (FFPE) tissues and the use of this DNA for next-generation sequencing (NGS) with no artifactual mutation

PLoS ONE, 2017, Vol 12 (5), pp. e0176280
DOI: 10.1371/journal.pone.0176280
Cited By ~ 37

Author(s): Naoki Einaga, Akio Yoshida, Hiroko Noda, Masaaki Suemitsu, Yuki Nakayama, ...

Keyword(s): Next Generation Sequencing, Next Generation, Formalin Fixed Paraffin, Formalin Fixed Paraffin Embedded, FFPE Tissues, Next Generation Sequencing NGS, Generation Sequencing, Formalin Fixed

Review of Clinical Next-Generation Sequencing

Archives of Pathology & Laboratory Medicine, 2017, Vol 141 (11), pp. 1544-1557
DOI: 10.5858/arpa.2016-0501-ra
Cited By ~ 87

Author(s): Sophia Yohe, Bharat Thyagarajan

Keyword(s): Next Generation Sequencing, Clinical Care, Cost Effective, Next Generation, Inherited Disorders, Academic Center, Next Generation Sequencing NGS, NGS Data, Generation Sequencing

Context.— Next-generation sequencing (NGS) is a technology being used by many laboratories to test for inherited disorders and tumor mutations. This technology is new for many practicing pathologists, who may not be familiar with the uses, methodology, and limitations of NGS. Objective.— To familiarize pathologists with several aspects of NGS, including current and expanding uses; methodology including wet bench aspects, bioinformatics, and interpretation; validation and proficiency; limitations; and issues related to the integration of NGS data into patient care. Data Sources.— The review is based on peer-reviewed literature and personal experience using NGS in a clinical setting at a major academic center. Conclusions.— The clinical applications of NGS will increase as the technology, bioinformatics, and resources evolve to address the limitations and improve quality of results. The challenge for clinical laboratories is to ensure testing is clinically relevant, cost-effective, and can be integrated into clinical care.

Assuring the Quality of Next-Generation Sequencing in Clinical Microbiology and Public Health Laboratories

Journal of Clinical Microbiology, 2016, Vol 54 (12), pp. 2857-2865
DOI: 10.1128/jcm.00949-16
Cited By ~ 62

Author(s): Amy S. Gargis, Lisa Kalman, Ira M. Lubin

Keyword(s): Public Health, Next Generation Sequencing, Clinical Microbiology, Next Generation, Standards And Guidelines, Control Procedures, Next Generation Sequencing NGS, Professional Guidelines, Generation Sequencing

Clinical microbiology and public health laboratories are beginning to utilize next-generation sequencing (NGS) for a range of applications. This technology has the potential to transform the field by providing approaches that will complement, or even replace, many conventional laboratory tests. While the benefits of NGS are significant, the complexities of these assays require an evolving set of standards to ensure testing quality. Regulatory and accreditation requirements, professional guidelines, and best practices that help ensure the quality of NGS-based tests are emerging. This review highlights currently available standards and guidelines for the implementation of NGS in the clinical and public health laboratory setting, and it includes considerations for NGS test validation, quality control procedures, proficiency testing, and reference materials.

NGseqBasic - a single-command UNIX tool for ATAC-seq, DNaseI-seq, Cut-and-Run, and ChIP-seq data mapping, high-resolution visualisation, and quality control

2018
DOI: 10.1101/393413
Cited By ~ 8

Author(s): Jelena Telenius, Jim R. Hughes

Keyword(s): Quality Control, Big Data, Next Generation Sequencing, High Resolution, Data Processing, Version Control, Data Set, Genome Group, Next Generation Sequencing NGS, Generation Sequencing

With the decreasing cost of next-generation sequencing (NGS), we are observing a rapid rise in the volume of ‘big data’ in the academic research, healthcare and drug discovery sectors. The present bottleneck for extracting value from these ‘big data’ sets is data processing and analysis. Despite this, there is still a lack of reliable, automated and easy-to-use tools that allow experimentalists to assess the quality of the sequenced libraries and explore the data first hand, without investing a lot of computational core analysts' time in the early stages of analysis.

NGseqBasic is an easy-to-use single-command analysis tool for chromatin accessibility (ATAC, DNaseI) and ChIP sequencing data, which also supports newer techniques such as low cell number sequencing and Cut-and-Run. It takes in fastq, fastq.gz or bam files, conducts all quality control, trimming and mapping steps, collects quality control and data processing statistics, and combines all of this into a single-click loadable UCSC data hub, with an integral statistics html page providing detailed reports from the analysis tools and quality control metrics. The tool is easy to set up, and no installation is needed. A wide variety of parameters are provided to fine-tune the analysis, with optional settings to generate DNase footprint or high resolution ChIP-seq tracks. A tester script is provided to help with the setup, along with a test data set and downloadable example user cases.

NGseqBasic has been used in the routine analysis of next generation sequencing (NGS) data in high-impact publications 1,2. The code is actively developed, and is accompanied by Git version control and a GitHub code repository. Here we demonstrate NGseqBasic analysis and features using DNaseI-seq data from GSM689849, CTCF-ChIP-seq data from GSM2579421, and a Cut-and-Run CTCF data set, GSM2433142. We also provide the one-click loadable UCSC data hubs generated by the tool, allowing ready exploration of the run results and quality control files.

Availability: Download, setup and help instructions are available on the NGseqBasic web site:
http://userweb.molbiol.ox.ac.uk/public/telenius/NGseqBasicManual/external/
Bioconda users can load the tool as library “ngseqbasic”. The source code with Git version control is available at:
https://github.com/Hughes-Genome-Group/NGseqBasic/
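
To make the input handling and quality assessment described above more concrete, here is a minimal sketch of reading plain or gzip-compressed FASTQ files and computing a mean base quality per read. It is not NGseqBasic code: the function names `read_fastq` and `mean_phred` are hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding. BAM input is not covered.

```python
import gzip
from pathlib import Path
from typing import Iterator, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a plain or gzipped FASTQ file.

    Assumes each record spans exactly four lines (no wrapped sequences).
    """
    opener = gzip.open if Path(path).suffix == ".gz" else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file (or a blank line) stops the iteration
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line, ignored
            quality = handle.readline().rstrip("\n")
            if not header.startswith("@") or len(sequence) != len(quality):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header[1:], sequence, quality


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of one read, assuming Phred+33 encoding by default."""
    if not quality:
        return 0.0  # empty read: report zero rather than divide by zero
    return sum(ord(ch) - offset for ch in quality) / len(quality)
```

A caller could, for example, iterate over `read_fastq("sample.fastq.gz")` (a hypothetical file name) and flag reads whose `mean_phred` falls below a chosen threshold before mapping.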
