Comparison Of Multi-locus Sequence Typing software For next generation sequencing data

ABSTRACTMulti-locus sequence typing (MLST) is a widely used method for categorising bacteria. Increasingly MLST is being performed using next generation sequencing data by reference labs and for clinical diagnostics. Many software applications have been developed to calculate sequence types from NGS data; however, there has been no comprehensive review to date on these methods. We have compared six of these applications against real and simulated data and present results on: 1. the accuracy of each method against traditional typing methods, 2. the performance on real outbreak datasets, 3. in the impact of contamination and varying depth of coverage, and 4. the computational resource requirements.DATA SUMMARYSimulated reads for datasets testing coverage and mixed samples have been deposited in Figshare; DOI:https://doi.org/10.6084/m9.figshare.4602301.vlOutbreak databases are available from Github; url -https://github.com/WGS-standards-and-analysis/datasetsDocker containers used to run each of the applications are available from Github; url –https://tinyurl.com/z7ks2ftAccession numbers for the data used in this paper are available in the Supplementary material.We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. ☒IMPACT STATEMENTSequence typing is rapidly transitioning from traditional sequencing methods to using whole genome sequencing. A number ofin silicoprediction methods have been developed on anad hocbasis and aim to replicate Multi-locus sequence typing (MLST). This is the first study to comprehensively evaluate multiple MLST software applications on real validated datasets and on common simulated difficult cases. It will give researchers a clearer understanding of the accuracy, limitations and computational performance of the methods they use, and will assist future researchers to choose the most appropriate method for their experimental goals.

Download Full-text

CoverView: a sequence quality evaluation tool for next generation sequencing data

Wellcome Open Research ◽

10.12688/wellcomeopenres.14306.1 ◽

2018 ◽

Vol 3 ◽

pp. 36 ◽

Cited By ~ 5

Author(s):

Márton Münz ◽

Shazia Mahamdallie ◽

Shawn Yost ◽

Andrew Rimmer ◽

Emma Poyastro-Pearson ◽

...

Keyword(s):

Quality Control ◽

Next Generation Sequencing ◽

Quality Evaluation ◽

Reference Sample ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Evaluation Tool ◽

Link Type ◽

Generation Sequencing

Quality assurance and quality control are essential for robust next generation sequencing (NGS). Here we present CoverView, a fast, flexible, user-friendly quality evaluation tool for NGS data. CoverView processes mapped sequencing reads and user-specified regions to report depth of coverage, base and mapping quality metrics with increasing levels of detail from a chromosome-level summary to per-base profiles. CoverView can flag regions that do not fulfil user-specified quality requirements, allowing suboptimal data to be systematically and automatically presented for review. It also provides an interactive graphical user interface (GUI) that can be opened in a web browser and allows intuitive exploration of results. We have integrated CoverView into our accredited clinical cancer predisposition gene testing laboratory that uses the TruSight Cancer Panel (TSCP). CoverView has been invaluable for optimisation and quality control of our testing pipeline, providing transparent, consistent quality metric information and automatic flagging of regions that fall below quality thresholds. We demonstrate this utility with TSCP data from the Genome in a Bottle reference sample, which CoverView analysed in 13 seconds. CoverView uses data routinely generated by NGS pipelines, reads standard input formats, and rapidly creates easy-to-parse output text (.txt) files that are customised by a simple configuration file. CoverView can therefore be easily integrated into any NGS pipeline. CoverView and detailed documentation for its use are freely available at github.com/RahmanTeamDevelopment/CoverView/releases and www.icr.ac.uk/CoverView

Download Full-text

Comparison of classical multi-locus sequence typing software for next-generation sequencing data

Microbial Genomics ◽

10.1099/mgen.0.000124 ◽

2017 ◽

Vol 3 (8) ◽

Cited By ~ 13

Author(s):

Andrew J. Page ◽

Nabil-Fareed Alikhan ◽

Heather A. Carleton ◽

Torsten Seemann ◽

Jacqueline A. Keane ◽

...

Keyword(s):

Next Generation Sequencing ◽

Multi Locus Sequence Typing ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing

Download Full-text

Pathogen-Host Analysis Tool (PHAT): an Integrative Platform to Analyze Pathogen-Host Relationships in Next-Generation Sequencing Data

10.1101/178327 ◽

2017 ◽

Author(s):

Christopher M. Gibb ◽

Robert Jackson ◽

Sabah Mohammed ◽

Jinan Fiaidhi ◽

Ingeborg Zehbe

Keyword(s):

Next Generation Sequencing ◽

Variant Calling ◽

Next Generation Sequencing Data ◽

Analysis Tool ◽

Next Generation ◽

Sequencing Data ◽

Link Type ◽

Reference File ◽

Host Relationships ◽

Generation Sequencing

AbstractSummaryThe Pathogen-Host Analysis Tool (PHAT) is an application for processing and analyzing next-generation sequencing (NGS) data as it relates to relationships between pathogen and host organisms. Unlike custom scripts and tedious pipeline programming, PHAT provides an integrative platform encompassing raw and aligned sequence and reference file input, quality control (QC) reporting, alignment and variant calling, linear and circular alignment viewing, and graphical and tabular output. This novel tool aims to be user-friendly for life scientists studying diverse pathogen-host relationships.Availability and ImplementationThe project is publicly available on GitHub (https://github.com/chgibb/PHAT) and includes convenient installers, as well as portable and source versions, for both Windows and Linux (Debian and RedHat). Up-to-date documentation for PHAT, including user guides and development notes, can be found at https://chgibb.github.io/PHATDocs/. We encourage users and developers to provide feedback (error reporting, suggestions, and comments) using GitHub Issues.ContactLead software developer: [email protected]

Download Full-text

Faculty Opinions recommendation of VarWalker: personalized mutation network analysis of putative cancer genes from next-generation sequencing data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718272765.793499663 ◽

2014 ◽

Author(s):

Gary Bader ◽

Mohamed Helmy

Keyword(s):

Next Generation Sequencing ◽

Network Analysis ◽

Next Generation Sequencing Data ◽

Cancer Genes ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing

Download Full-text

Faculty Opinions recommendation of Bioinformatory-assisted analysis of next-generation sequencing data for precision medicine in pancreatic cancer.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727775566.793536095 ◽

2017 ◽

Author(s):

Steve Pereira

Keyword(s):

Pancreatic Cancer ◽

Next Generation Sequencing ◽

Precision Medicine ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Assisted Analysis ◽

Generation Sequencing

Download Full-text

NGSremix: A software tool for estimating pairwise relatedness between admixed individuals from next-generation sequencing data

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab174 ◽

2021 ◽

Author(s):

Anne Krogh Nøhr ◽

Kristian Hanghøj ◽

Genis Garcia Erill ◽

Zilong Li ◽

Ida Moltke ◽

...

Keyword(s):

Next Generation Sequencing ◽

Genetic Research ◽

Likelihood Estimation ◽

Software Tool ◽

Estimation Methods ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Ngs Data ◽

Generation Sequencing

Abstract Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if this is present. However, the methods that can account for admixture are all based on genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below we show that our method works well and clearly outperforms all the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate that all perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C ++ in a multi-threaded software and is freely available on Github https://github.com/KHanghoj/NGSremix.

Download Full-text

recoup: flexible and versatile signal visualization from next generation sequencing

BMC Bioinformatics ◽

10.1186/s12859-020-03902-x ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Panagiotis Moulos

Keyword(s):

Next Generation Sequencing ◽

Next Generation Sequencing Data ◽

Special Focus ◽

Next Generation ◽

Sequencing Data ◽

User Friendliness ◽

Computational Environment ◽

Level Data ◽

Data Signal ◽

Generation Sequencing

Abstract Background The relentless continuing emergence of new genomic sequencing protocols and the resulting generation of ever larger datasets continue to challenge the meaningful summarization and visualization of the underlying signal generated to answer important qualitative and quantitative biological questions. As a result, the need for novel software able to reliably produce quick, comprehensive, and easily repeatable genomic signal visualizations in a user-friendly manner is rapidly re-emerging. Results recoup is a Bioconductor package for quick, flexible, versatile, and accurate visualization of genomic coverage profiles generated from Next Generation Sequencing data. Coupled with a database of precalculated genomic regions for multiple organisms, recoup offers processing mechanisms for quick, efficient, and multi-level data interrogation with minimal effort, while at the same time creating publication-quality visualizations. Special focus is given on plot reusability, reproducibility, and real-time exploration and formatting options, operations rarely supported in similar visualization tools in a profound way. recoup was assessed using several qualitative user metrics and found to balance the tradeoff between important package features, including speed, visualization quality, overall friendliness, and the reusability of the results with minimal additional calculations. Conclusion While some existing solutions for the comprehensive visualization of NGS data signal offer satisfying results, they are often compromised regarding issues such as effortless tracking of processing and preparation steps under a common computational environment, visualization quality and user friendliness. recoup is a unique package presenting a balanced tradeoff for a combination of assessment criteria while remaining fast and friendly.

Download Full-text