Determining the serotype composition of mixed samples of pneumococcus using whole genome sequencing

2019 ◽  
Author(s):  
James R. Knight ◽  
Eileen M. Dunne ◽  
E. Kim Mulholland ◽  
Sudipta Saha ◽  
Catherine Satzke ◽  
...  

Abstract. Serotyping of Streptococcus pneumoniae is a critical tool in the surveillance of the pathogen and in the development and evaluation of vaccines. Whole-genome DNA sequencing and analysis is becoming increasingly common and is an effective method for pneumococcal serotype identification of pure isolates. However, because of the complexities of the pneumococcal capsular loci, current analysis software requires samples to be pure (or nearly pure) and only contain a single pneumococcal serotype. We introduce a new software tool called SeroCall, which can identify and quantitate the serotypes present in samples, even when several serotypes are present. The sample preparation, library preparation and sequencing follow standard laboratory protocols. The software runs as fast as or faster than existing identification tools on typical computing servers and is freely available under an open source license at https://github.com/knightjimr/serocall. Using samples with known concentrations of different serotypes as well as blinded samples, we were able to accurately quantify the abundance of different serotypes of pneumococcus in mixed cultures, with 100% accuracy for detecting the major serotype and up to 86% accuracy for detecting minor serotypes. We were also able to track changes in serotype frequency over time in an experimental setting. This approach could be applied in both epidemiologic field studies of pneumococcal colonization and experimental lab studies, and could provide a cheaper and more efficient method for serotyping than alternative approaches.

2020 ◽  
Author(s):  
James R. Knight ◽  
Eileen M. Dunne ◽  
E. Kim Mulholland ◽  
Sudipta Saha ◽  
Catherine Satzke ◽  
...  

Serotyping of Streptococcus pneumoniae is a critical tool in the surveillance of the pathogen and in the development and evaluation of vaccines. Whole-genome DNA sequencing and analysis is becoming increasingly common and is an effective method for pneumococcal serotype identification of pure isolates. However, because of the complexities of the pneumococcal capsular loci, current analysis software requires samples to be pure (or nearly pure) and only contain a single pneumococcal serotype. We introduce a new software tool called SeroCall, which can identify and quantitate the serotypes present in samples, even when several serotypes are present. The sample preparation, library preparation and sequencing follow standard laboratory protocols. The software runs as fast as or faster than existing identification tools on typical computing servers and is freely available under an open source licence at https://github.com/knightjimr/serocall. Using samples with known concentrations of different serotypes as well as blinded samples, we were able to accurately quantify the abundance of different serotypes of pneumococcus in mixed cultures, with 100 % accuracy for detecting the major serotype and up to 86 % accuracy for detecting minor serotypes. We were also able to track changes in serotype frequency over time in an experimental setting. This approach could be applied in both epidemiological field studies of pneumococcal colonization and experimental laboratory studies, and could provide a cheaper and more efficient method for serotyping than alternative approaches.
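The abstract does not describe how SeroCall quantitates serotypes, so the following is only a minimal illustrative sketch of what "relative abundance in a mixed sample" means, not the tool's algorithm. It assumes that reads have already been assigned to serotypes by some upstream step; the read counts and the `relative_abundance` helper are hypothetical.

```python
# Hypothetical read counts assigned to each serotype in one mixed sample.
# These numbers are invented for illustration and are not from the paper.
reads_per_serotype = {"19F": 8200, "6B": 1500, "23F": 300}


def relative_abundance(counts):
    """Return each serotype's share of all serotype-assigned reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any serotype")
    return {serotype: n / total for serotype, n in counts.items()}


abundance = relative_abundance(reads_per_serotype)

# The major serotype is the one with the largest share; the rest are minor.
major_serotype = max(abundance, key=abundance.get)
minor_serotypes = sorted(s for s in abundance if s != major_serotype)
```

In this toy setup the major serotype is simply the largest share of assigned reads, and every other serotype present counts as minor.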


2008 ◽  
Vol 5 (4) ◽  
pp. 319-322 ◽  
Author(s):  
Sung Kyu Park ◽  
John D Venable ◽  
Tao Xu ◽  
John R Yates

2012 ◽  
Vol 63 (8) ◽  
pp. 1609-1630 ◽  
Author(s):  
M.J. Cobo ◽  
A.G. López-Herrera ◽  
E. Herrera-Viedma ◽  
F. Herrera

2016 ◽  
Author(s):  
Egor Dolzhenko ◽  
Joke J.F.A. van Vugt ◽  
Richard J. Shaw ◽  
Mitchell A. Bekritsky ◽  
Marka van Blitterswijk ◽  
...  

Abstract. Identifying large repeat expansions such as those that cause amyotrophic lateral sclerosis (ALS) and fragile X syndrome is challenging for short-read (100-150 bp) whole genome sequencing (WGS) data. A solution to this problem is an important step towards integrating WGS into precision medicine. We have developed a software tool called ExpansionHunter that, using PCR-free WGS short-read data, can genotype repeats at the locus of interest, even if the expanded repeat is larger than the read length. We applied our algorithm to WGS data from 3,001 ALS patients who have been tested for the presence of the C9orf72 repeat expansion with repeat-primed PCR (RP-PCR). Taking the RP-PCR calls as the ground truth, our WGS-based method identified pathogenic repeat expansions with 98.1% sensitivity and 99.7% specificity. Further inspection showed that all 11 conflicts were errors in the original RP-PCR results. Compared against this updated result, ExpansionHunter correctly classified all (212/212) of the expanded samples as either expansions (208) or potential expansions (4). Additionally, 99.9% (2,786/2,789) of the wild type samples were correctly classified as wild type by this method with the remaining two identified as possible expansions. We further applied our algorithm to a set of 144 samples where every sample had one of eight different pathogenic repeat expansions including examples associated with fragile X syndrome, Friedreich’s ataxia and Huntington’s disease and correctly flagged all of the known repeat expansions. Finally, we tested the accuracy of our method for short repeats by comparing our genotypes with results from 860 samples sized using fragment length analysis and determined that our calls were >95% accurate. ExpansionHunter can be used to accurately detect known pathogenic repeat expansions and provides researchers with a tool that can be used to identify new pathogenic repeat expansions.
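The rates quoted against the updated RP-PCR result can be reproduced from the counts given in the abstract. The short sketch below only redoes that arithmetic; the variable names and the `percent` helper are illustrative and are not part of ExpansionHunter.

```python
# Counts taken from the abstract (comparison against the updated RP-PCR result).
expanded_total = 212      # samples carrying the C9orf72 expansion
called_expansion = 208    # of those, called as expansions
called_potential = 4      # of those, called as potential expansions
wild_type_total = 2789    # wild type samples
wild_type_correct = 2786  # of those, called as wild type


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


# Expanded samples flagged either as expansions or as potential expansions.
flagged_expanded = called_expansion + called_potential
expanded_rate = percent(flagged_expanded, expanded_total)

# Wild type samples classified as wild type.
wild_type_rate = percent(wild_type_correct, wild_type_total)

# The two groups together should account for the whole cohort.
cohort_size = expanded_total + wild_type_total
```

The two groups add up to the 3,001 patients mentioned earlier in the abstract, which is a quick consistency check on the reported counts.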


2019 ◽  
Author(s):  
Lauren Marazzi ◽  
Andrew Gainer-Dewar ◽  
Paola Vera-Licona

Abstract. Summary: OCSANA+ is a Cytoscape app for identifying nodes to drive the system towards a desired long-term behavior, prioritizing combinations of interventions in large scale complex networks, and estimating the effects of node perturbations in signaling networks, all based on the analysis of the network’s structure. OCSANA+ includes an update to the OCSANA (optimal combinations of interventions from network analysis) software tool with cutting-edge and rigorously tested algorithms, together with recently developed structure-based control algorithms for non-linear systems and an algorithm for estimating signal flow. All these algorithms are based on the network’s topology. OCSANA+ is implemented as a Cytoscape app to enable a user interface for running analyses and visualizing results. Availability and Implementation: The OCSANA+ app and its tutorial can be downloaded from the Cytoscape App Store or https://veraliconaresearchgroup.github.io/OCSANA-Plus/. The source code and computations are available at https://github.com/VeraLiconaResearchGroup/OCSANA-Plus_SourceCode.


2016 ◽  
Author(s):  
Ryan L. Collins ◽  
Matthew R. Stone ◽  
Harrison Brand ◽  
Joseph T. Glessner ◽  
Michael E. Talkowski

Abstract. Summary: Copy number variation (CNV) is a major component of structural differences between individual genomes. The recent emergence of population-scale whole-genome sequencing (WGS) datasets has enabled genome-wide CNV delineation. However, molecular validation at this scale is impractical, so visualization is an invaluable preliminary screening approach when evaluating CNVs. Standardized tools for visualization of CNVs in large WGS datasets are therefore in wide demand. Methods & Results: To address this demand, we developed a software tool, CNView, for normalized visualization, statistical scoring, and annotation of CNVs from population-scale WGS datasets. CNView surmounts challenges of sequencing depth variability between individual libraries by locally adapting to cohort-wide variance in sequencing uniformity at any locus. Importantly, CNView is broadly extensible to any reference genome assembly and most current WGS data types. Availability and Implementation: CNView is written in R, is supported on OS X, MS Windows, and Linux, and is freely distributed under the MIT license. Source code and documentation are available from https://github.com/RCollins13/[email protected]

