Enabling metagenomic surveillance for bacterial tick-borne pathogens using nanopore sequencing with adaptive sampling

2021 ◽  
Author(s):  
Evan J. Kipp ◽  
Laramie L. Lindsey ◽  
Benedict S. Khoo ◽  
Christopher Faulk ◽  
Jonathan D. Oliver ◽  
...  

Technological and computational advancements in the fields of genomics and bioinformatics are providing exciting new opportunities for pathogen discovery and surveillance. In particular, single-molecule nucleotide sequence data originating from Oxford Nanopore Technologies (ONT) sequencing platforms can be bioinformatically leveraged, in real-time, for enhanced biosurveillance of a vast array of zoonoses. The recently released nanopore adaptive sampling (NAS) pipeline facilitates immediate mapping of individual nucleotide molecules (i.e., DNA, cDNA, and RNA) to a given reference as each molecule is sequenced. User-defined thresholds then allow for the retention or rejection of specific molecules, informed by the real-time reference mapping results, as they are physically passing through a given sequencing nanopore. Here, we show how NAS can be used to selectively sequence entire genomes of bacterial tick-borne pathogens circulating in wild populations of the blacklegged tick vector, Ixodes scapularis. The NAS method provided a two-fold increase in targeted pathogen sequences, successfully enriching for Borrelia (Borreliella) burgdorferi s.s.; Borrelia (Borrelia) miyamotoi; Anaplasma phagocytophilum; and Ehrlichia muris eauclairensis genomic DNA within our I. scapularis samples. Our results indicate that NAS has strong potential for real-time sequence-based pathogen surveillance.
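The accept/reject step described in this abstract can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Mapping` record, the `decide` function, and the threshold values are assumptions chosen for illustration, not the ONT software's API or the settings used by the authors.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Mapping:
    """Hypothetical result of mapping the first chunk of a read to a reference."""
    reference_name: str  # name of the reference sequence that was hit
    identity: float      # fraction of matching bases in the alignment, 0..1


def decide(mapping: Optional[Mapping],
           targets: Set[str],
           bases_seen: int,
           min_identity: float = 0.8,             # assumed threshold
           max_bases_before_reject: int = 1000    # assumed threshold
           ) -> str:
    """Return 'accept', 'reject', or 'wait' for a molecule still in the pore."""
    # A confident mapping settles the decision immediately.
    if mapping is not None and mapping.identity >= min_identity:
        return "accept" if mapping.reference_name in targets else "reject"
    # No confident mapping yet: give up once enough bases have been read.
    if bases_seen >= max_bases_before_reject:
        return "reject"
    # Otherwise keep sequencing and try again with more signal.
    return "wait"
```

In this sketch a molecule that maps confidently to a target reference is kept, a molecule that maps confidently elsewhere (for example to the tick host) is ejected, and an unmapped molecule is ejected only after a fixed number of bases, so that short uninformative prefixes do not trigger premature rejection.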

2013 ◽  
Vol 5 ◽  
pp. BECB.S10886 ◽  
Author(s):  
Brijesh Singh Yadav ◽  
Venkateswarlu Ronda ◽  
Dinesh P. Vashista ◽  
Bhaskar Sharma

Recent advances in sequencing technologies and computational approaches are propelling scientists ever closer towards a complete understanding of human-microbial interactions. Powerful sequencing platforms are rapidly producing huge amounts of nucleotide sequence data, which are compiled into large databases. These sequence data can be retrieved, assembled, and analyzed for identification of microbial pathogens and diagnosis of diseases. In this article, we present a commentary on how metagenomics, combined with microarray and new sequencing techniques, is helping microbial detection and characterization.


2016 ◽  
Vol 7 (6) ◽  
pp. 1230-1235 ◽  
Author(s):  
Christine B. Graham ◽  
Mark A. Pilgard ◽  
Sarah E. Maes ◽  
Andrias Hojgaard ◽  
Rebecca J. Eisen

F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 297 ◽  
Author(s):  
Jason R. Miller ◽  
Sergey Koren ◽  
Kari A. Dilley ◽  
Derek M. Harkins ◽  
Timothy B. Stockwell ◽  
...  

Background: The tick cell line ISE6, derived from Ixodes scapularis, is commonly used for amplification and detection of arboviruses in environmental or clinical samples. Methods: To assist with sequence-based assays, we sequenced the ISE6 genome with single-molecule, long-read technology. Results: The draft assembly appears near complete based on gene content analysis, though it appears to lack some instances of repeats in this highly repetitive genome. The assembly appears to have separated the haplotypes at many loci. DNA short read pairs, used for validation only, mapped to the cell line assembly at a higher rate than they mapped to the Ixodes scapularis reference genome sequence. Conclusions: The assembly could be useful for filtering host genome sequence from sequence data obtained from cells infected with pathogens.


Author(s):  
Daniel H. Goldman ◽  
Nathan M. Livingston ◽  
Jonathan Movsik ◽  
Bin Wu ◽  
Rachel Green

Translation of problematic mRNA sequences induces ribosome stalling. Collided ribosomes at the stall site are recognized by cellular quality control machinery, resulting in dissociation of the ribosome from the mRNA and subsequent degradation of the nascent polypeptide and, in some organisms, decay of the mRNA. However, the timing and regulation of these processes are unclear. We developed a SunTag-based reporter to monitor translation in real time on single mRNAs harboring difficult-to-translate poly(A) stretches. This reporter recapitulates previous findings in human cells that an internal poly(A) stretch reduces protein output ∼10-fold, while mRNA levels are relatively unaffected. Long-term imaging of translation indicates that poly(A)-containing mRNAs are robustly translated in the absence of detectable mRNA cleavage. However, quantification of ribosome density reveals a ∼3-fold increase in the number of ribosomes on poly(A)-containing mRNAs compared to a control mRNA, consistent with queues of many stalled ribosomes. Using single-molecule harringtonine runoff experiments, we observe the resolution of these queues in real time by the cellular quality control machinery, and find that rescue is very slow compared to both elongation and termination. We propose that the very slow clearance of stalled ribosomes provides the basis for the cell to distinguish between transient and deleterious stalls, and that the human quality control apparatus predominantly targets the nascent protein rather than the mRNA.


2021 ◽  
Author(s):  
Yuwei Bao ◽  
Jack Wadden ◽  
John R. Erb-Downward ◽  
Piyush Ranjan ◽  
Robert P. Dickson ◽  
...  

Single-molecule sequencers made by Oxford Nanopore provide results in real time as DNA passes through a nanopore and can eject a molecule after it has been partly sequenced. However, the computational challenge of deciding whether to keep or reject a molecule in real time has limited the application of this capability. We present SquiggleNet, the first deep learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than the DNA passes through the pore, allowing real-time classification and read ejection. When given the amount of sequencing data generated in one second, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than approaches based on alignment. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy across test datasets from different flowcells and sample preparations, generalized to unseen species, and identified bacterial species in a human respiratory metagenome sample.


GigaScience ◽  
2020 ◽  
Vol 9 (9) ◽  
Author(s):  
Mengyang Xu ◽  
Lidong Guo ◽  
Shengqiang Gu ◽  
Ou Wang ◽  
Rui Zhang ◽  
...  

Background: Analyses that use genome assemblies are critically affected by the contiguity, completeness, and accuracy of those assemblies. In recent years single-molecule sequencing techniques generating long-read information have become available and enabled substantial improvement in contig length and genome completeness, especially for large genomes (>100 Mb), although bioinformatic tools for these applications are still limited. Findings: We developed a software tool to close sequence gaps in genome assemblies, TGS-GapCloser, that uses low-depth (∼10×) long single-molecule reads. The algorithm extracts reads that bridge gap regions between 2 contigs within a scaffold, error corrects only the candidate reads, and assigns the best sequence data to each gap. As a demonstration, we used TGS-GapCloser to improve the scaftig NG50 value of 3 human genome assemblies by 24-fold on average with only ∼10× coverage of Oxford Nanopore or Pacific Biosciences reads, covering with sequence data up to 94.8% of gaps with 97.7% positive predictive value. These improved assemblies achieve 99.998% (Q46) single-base accuracy with final inserted sequences having 99.97% (Q35) accuracy, despite the high raw error rate of single-molecule reads, enabling high-quality downstream analyses, including up to a 31-fold increase in the scaftig NGA50 and up to 13.1% more complete BUSCO genes. Additionally, we show that even in ultra-large genome assemblies, such as the ginkgo (∼12 Gb), TGS-GapCloser can cover 71.6% of gaps with sequence data. Conclusions: TGS-GapCloser can close gaps in large genome assemblies using raw long reads quickly and cost-effectively. The final assemblies generated by TGS-GapCloser have improved contiguity and completeness while maintaining high accuracy. The software is available at https://github.com/BGI-Qingdao/TGS-GapCloser.
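The accuracy percentages and Q values quoted in this abstract are related by the standard Phred formula, Q = -10 log10(error rate). The short sketch below shows that conversion; the function name `phred_from_accuracy` is an illustrative assumption and is not part of TGS-GapCloser.

```python
import math


def phred_from_accuracy(accuracy_percent: float) -> float:
    """Convert a per-base accuracy in percent to a Phred-scaled quality score."""
    error_rate = 1.0 - accuracy_percent / 100.0
    if error_rate <= 0.0 or error_rate > 1.0:
        # 100% accuracy has no finite Phred score; out-of-range input is invalid.
        raise ValueError("accuracy_percent must be in the range [0, 100)")
    return -10.0 * math.log10(error_rate)


# Accuracies reported in the abstract.
assembly_q = phred_from_accuracy(99.998)
inserted_q = phred_from_accuracy(99.97)
```

Under this formula the two accuracies correspond to scores of roughly 47.0 and 35.2, which is consistent with the Q46 and Q35 quoted in the abstract if the values are rounded down.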


2013 ◽  
Vol 368 (1614) ◽  
pp. 20120202 ◽  
Author(s):  
Nicholas J. Croucher ◽  
Simon R. Harris ◽  
Yonatan H. Grad ◽  
William P. Hanage

Sequence data are well established in the reconstruction of the phylogenetic and demographic scenarios that have given rise to outbreaks of viral pathogens. The application of similar methods to bacteria has been hindered in the main by the lack of high-resolution nucleotide sequence data from quality samples. Developing and already available genomic methods have greatly increased the amount of data that can be used to characterize an isolate and its relationship to others. However, differences in sequencing platforms and data analysis mean that these enhanced data come with a cost in terms of portability: results from one laboratory may not be directly comparable with those from another. Moreover, genomic data for many bacteria bear the mark of a history including extensive recombination, which has the potential to greatly confound phylogenetic and coalescent analyses. Here, we discuss the exacting requirements of genomic epidemiology, and means by which the distorting signal of recombination can be minimized to permit the leverage of growing datasets of genomic data from bacterial pathogens.


2019 ◽  
Author(s):  
Nicolas Cardozo ◽  
Karen Zhang ◽  
Katie Doroschak ◽  
Aerilynn Nguyen ◽  
Zoheb Siddiqui ◽  
...  

Genetically encoded reporter proteins are a cornerstone of molecular biology. While they are widely used to measure many biological activities, the current number of uniquely addressable reporters that can be used together for one-pot multiplexed tracking is small due to overlapping detection channels such as fluorescence. To address this, we built an expanded library of orthogonally-barcoded Nanopore-addressable protein Tags Engineered as Reporters (NanoporeTERs), which can be read and demuxed by nanopore sensors at the single-molecule level. By adapting a commercially available nanopore sensor array platform typically used for real-time DNA and RNA sequencing (Oxford Nanopore Technologies’ MinION), we show direct detection of NanoporeTER expression levels from unprocessed bacterial culture with no specialized sample preparation. These results lay the foundations for a new class of reporter proteins to enable multiplexed, real-time tracking of gene expression with nascent nanopore sensor technology.


2015 ◽  
Author(s):  
Sissel Juul ◽  
Fernando Izquierdo ◽  
Adam Hurst ◽  
Xiaoguang Dai ◽  
Amber Wright ◽  
...  

Whole genome sequencing on next-generation instruments provides an unbiased way to identify the organisms present in complex metagenomic samples. However, the time-to-result can be protracted because of fixed-time sequencing runs and cumbersome bioinformatics workflows. This limits the utility of the approach in settings where rapid species identification is crucial, such as in the quality control of food-chain components, or during an outbreak of an infectious disease. Here we present What's in my Pot? (WIMP), a laboratory and analysis workflow in which, starting with an unprocessed sample, sequence data is generated and bacteria, viruses and fungi present in the sample are classified to subspecies and strain level in a quantitative manner, without prior knowledge of the sample composition, in approximately 3.5 hours. This workflow relies on the combination of Oxford Nanopore Technologies' MinION™ sensing device with a real-time species identification bioinformatics application.


2017 ◽  
Author(s):  
Felix Francis ◽  
Michael D. Dumas ◽  
Scott B. Davis ◽  
Randall J. Wisser

BACKGROUND: Targeted resequencing with high-throughput sequencing (HTS) platforms can be used to efficiently interrogate the genomes of large numbers of individuals. A critical challenge for research and applications using HTS data, especially from long-read platforms, is errors arising from technological limits and bioinformatic algorithms. RESULTS: A single molecule real-time (SMRT) sequencing-error correction and assembly pipeline, C3S-LAA, was developed for libraries of pooled amplicons. By uniquely leveraging the structure of SMRT sequence data (comprised of multiple low quality subreads from which higher quality circular consensus sequences are formed) to cluster raw reads, C3S-LAA produced accurate consensus sequences and assemblies of overlapping amplicons from single sample and multiplexed libraries. In contrast, despite read depths in excess of 100X per amplicon, the standard long amplicon analysis module from Pacific Biosciences generated unexpected numbers of amplicon sequences with substantial inaccuracies in the consensus sequences. A bootstrap analysis showed that the C3S-LAA pipeline per se was effective at removing bioinformatic sources of error, but in rare cases a read depth of nearly 400X was not sufficient to overcome minor but systematic errors inherent to amplification or sequencing. CONCLUSIONS: C3S-LAA uses a novel processing algorithm for SMRT amplicon-sequence data that produces accurate consensus sequences and local sequence assemblies. The community standard long amplicon analysis module from Pacific Biosciences is prone to substantial errors that raise concerns about findings based on this pipeline. The method developed here removed this confounding bioinformatics source of error, allowing for the identification of limited instances of errors due to DNA amplification or sequencing.

