Naïve Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy

2007 ◽  
Vol 73 (16) ◽  
pp. 5261-5267 ◽  
Author(s):  
Qiong Wang ◽  
George M. Garrity ◽  
James M. Tiedje ◽  
James R. Cole

ABSTRACT The Ribosomal Database Project (RDP) Classifier, a naïve Bayesian classifier, can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes (2nd ed., release 5.0, Springer-Verlag, New York, NY, 2004). It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The majority of classifications (98%) were of high estimated confidence (≥95%) and high accuracy (98%). In addition to being tested with the corpus of 5,014 type strain sequences from Bergey's outline, the RDP Classifier was tested with a corpus of 23,095 rRNA sequences as assigned by the NCBI into their alternative higher-order taxonomy. The results from leave-one-out testing on both corpora show that the overall accuracies at all levels of confidence for near-full-length and 400-base segments were 89% or above down to the genus level, and the majority of the classification errors appear to be due to anomalies in the current taxonomies. For shorter rRNA segments, such as those that might be generated by pyrosequencing, the error rate varied greatly over the length of the 16S rRNA gene, with segments around the V2 and V4 variable regions giving the lowest error rates. The RDP Classifier is suitable both for the analysis of single rRNA sequences and for the analysis of libraries of thousands of sequences. Another related tool, RDP Library Compare, was developed to facilitate microbial-community comparison based on 16S rRNA gene sequence libraries. It combines the RDP Classifier with a statistical test to flag taxa differentially represented between samples. The RDP Classifier and RDP Library Compare are available online at http://rdp.cme.msu.edu/ .
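
The classification idea in the abstract above (a naïve Bayesian assignment with a confidence estimate) can be illustrated with a minimal sketch. The abstract does not give the feature set or the confidence procedure, so the 8-base word size, the smoothing constants, and the bootstrap over one-eighth of the query words are assumptions of this sketch, and the class and function names are hypothetical. It only handles a single rank (genus), not the full domain-to-genus hierarchy.

```python
import math
import random
from collections import Counter, defaultdict

K = 8  # word size; an assumption of this sketch


def words(seq, k=K):
    """Return the set of overlapping k-base words in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


class NaiveBayesGenusClassifier:
    """Toy single-rank classifier; not the RDP implementation."""

    def __init__(self, k=K):
        self.k = k
        self.n_seqs = 0                              # training sequences overall
        self.word_seq_count = Counter()              # sequences containing each word
        self.genus_seq_count = Counter()             # training sequences per genus
        self.genus_word_count = defaultdict(Counter) # per-genus word occurrences

    def train(self, labelled):
        """labelled: iterable of (genus, sequence) pairs."""
        for genus, seq in labelled:
            ws = words(seq, self.k)
            self.n_seqs += 1
            self.genus_seq_count[genus] += 1
            self.word_seq_count.update(ws)
            self.genus_word_count[genus].update(ws)

    def _log_score(self, genus, ws):
        """Sum of log word probabilities, assuming independent words."""
        m_total = self.genus_seq_count[genus]
        counts = self.genus_word_count[genus]
        score = 0.0
        for w in ws:
            # Word prior smooths the estimate for words unseen in this genus.
            prior = (self.word_seq_count[w] + 0.5) / (self.n_seqs + 1)
            score += math.log((counts[w] + prior) / (m_total + 1))
        return score

    def _best(self, ws):
        return max(self.genus_seq_count, key=lambda g: self._log_score(g, ws))

    def classify(self, seq, trials=100, rng=None):
        """Return (genus, bootstrap confidence in [0, 1])."""
        rng = rng or random.Random(0)
        ws = sorted(words(seq, self.k))
        if not ws:
            raise ValueError("query is shorter than the word size")
        best = self._best(ws)
        subset_size = max(1, len(ws) // 8)
        hits = 0
        for _ in range(trials):
            # Resample words with replacement and re-classify.
            sample = [rng.choice(ws) for _ in range(subset_size)]
            hits += self._best(sample) == best
        return best, hits / trials
```

The confidence here is simply the fraction of bootstrap trials that agree with the full-sequence assignment, which is one plausible reading of "confidence estimates for each assignment".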

2013 ◽  
Vol 79 (17) ◽  
pp. 5112-5120 ◽  
Author(s):  
James J. Kozich ◽  
Sarah L. Westcott ◽  
Nielson T. Baxter ◽  
Sarah K. Highlander ◽  
Patrick D. Schloss

ABSTRACT Rapid advances in sequencing technology have changed the experimental landscape of microbial ecology. In the last 10 years, the field has moved from sequencing hundreds of 16S rRNA gene fragments per study using clone libraries to the sequencing of millions of fragments per study using next-generation sequencing technologies from 454 and Illumina. As these technologies advance, it is critical to assess the strengths, weaknesses, and overall suitability of these platforms for the interrogation of microbial communities. Here, we present an improved method for sequencing variable regions within the 16S rRNA gene using Illumina's MiSeq platform, which is currently capable of producing paired 250-nucleotide reads. We evaluated three overlapping regions of the 16S rRNA gene that vary in length (i.e., V34, V4, and V45) by resequencing a mock community and natural samples from human feces, mouse feces, and soil. By titrating the concentration of 16S rRNA gene amplicons applied to the flow cell and using a quality score-based approach to correct discrepancies between reads used to construct contigs, we were able to reduce error rates by as much as two orders of magnitude. Finally, we reprocessed samples from a previous study to demonstrate that large numbers of samples could be multiplexed and sequenced in parallel with shotgun metagenomes. These analyses demonstrate that our approach can provide data that are at least as good as those generated by the 454 platform while providing considerably higher sequencing coverage for a fraction of the cost.
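
The quality score-based correction mentioned in the abstract above can be sketched as follows. This is an illustration only: it assumes the reverse read has already been reverse-complemented and aligned to the forward read without gaps, and the function name and the `min_delta` threshold are hypothetical, since the abstract does not state the rule used.

```python
def merge_overlap(fwd_bases, fwd_quals, rev_bases, rev_quals, min_delta=6):
    """Build a consensus over the overlapping part of a read pair.

    Inputs cover only the overlap and are position-matched. Where the
    two reads disagree, the base with the clearly higher quality score
    wins; if neither is clearly better, the position becomes 'N'.
    min_delta is a hypothetical threshold chosen for this sketch.
    """
    lengths = {len(fwd_bases), len(fwd_quals), len(rev_bases), len(rev_quals)}
    if len(lengths) != 1:
        raise ValueError("bases and quality scores must all have equal length")

    consensus = []
    for fb, fq, rb, rq in zip(fwd_bases, fwd_quals, rev_bases, rev_quals):
        if fb == rb:
            consensus.append(fb)                     # reads agree
        elif abs(fq - rq) >= min_delta:
            consensus.append(fb if fq > rq else rb)  # trust the better call
        else:
            consensus.append("N")                    # ambiguous disagreement
    return "".join(consensus)
```

Contigs that still contain `N` after merging can then be discarded or kept, depending on how strict the downstream analysis needs to be.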


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2492 ◽  
Author(s):  
Catherine M. Burke ◽  
Aaron E. Darling

Background. The bacterial 16S rRNA gene has historically been used in defining bacterial taxonomy and phylogeny. However, there are currently no high-throughput methods to sequence full-length 16S rRNA genes present in a sample with precision. Results. We describe a method for sequencing near full-length 16S rRNA gene amplicons using the high throughput Illumina MiSeq platform and test it using DNA from human skin swab samples. Proof of principle of the approach is demonstrated, with the generation of 1,604 sequences greater than 1,300 nt from a single Nano MiSeq run, with accuracy estimated to be 100-fold higher than standard Illumina reads. The reads were chimera filtered using information from a single molecule dual tagging scheme that boosts the signal available for chimera detection. Conclusions. This method could be scaled up to generate many thousands of sequences per MiSeq run and could be applied to other sequencing platforms. This has great potential for populating databases with high quality, near full-length 16S rRNA gene sequences from under-represented taxa and environments and facilitates analyses of microbial communities at higher resolution.
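
One way a dual tagging scheme could expose chimeras is sketched below. The abstract does not describe the mechanism, so this rests on an assumption: each template molecule carries one random tag at each end, so a genuine molecule always shows the same left/right tag pairing, while a chimera joins the left tag of one molecule to the right tag of another. The function name and input layout are hypothetical.

```python
from collections import Counter, defaultdict


def flag_chimeric_tag_pairs(tag_pair_counts):
    """Return the set of (left_tag, right_tag) pairs that look chimeric.

    tag_pair_counts maps (left_tag, right_tag) -> number of reads.
    A pair is flagged when either tag is seen more often with a
    different partner, i.e. the pairing is not the dominant one.
    """
    by_left = defaultdict(Counter)
    by_right = defaultdict(Counter)
    for (left, right), n in tag_pair_counts.items():
        by_left[left][right] += n
        by_right[right][left] += n

    chimeric = set()
    for left, right in tag_pair_counts:
        dominant_right = by_left[left].most_common(1)[0][0]
        dominant_left = by_right[right].most_common(1)[0][0]
        if dominant_right != right or dominant_left != left:
            chimeric.add((left, right))
    return chimeric
```

Because the signal comes from the tags rather than from the 16S sequence itself, a check of this kind does not depend on a reference database of known parents.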


2015 ◽  
Vol 65 (Pt_6) ◽  
pp. 1929-1934 ◽  
Author(s):  
Morgane Rossi-Tamisier ◽  
Samia Benamar ◽  
Didier Raoult ◽  
Pierre-Edouard Fournier

Modern bacterial taxonomy is based on a polyphasic approach that combines phenotypic and genotypic characteristics, including 16S rRNA sequence similarity. However, the 95 % (for genus) and 98.7 % (for species) sequence similarity thresholds that are currently recommended to classify bacterial isolates were defined by comparison of a limited number of bacterial species, and may not apply to many genera that contain human-associated species. For each of 158 bacterial genera containing human-associated species, we computed pairwise sequence similarities between all species that have names with standing in nomenclature and then analysed the results, considering as abnormal any similarity value lower than 95 % or greater than 98.7 %. Many of the current bacterial species with validly published names do not respect the 95 and 98.7 % thresholds, with 57.1 % of species exhibiting 16S rRNA gene sequence similarity rates ≥98.7 %, and 60.1 % of genera containing species exhibiting a 16S rRNA gene sequence similarity rate <95 %. In only 17 of the 158 genera studied (10.8 %), all species respected the 95 and 98.7 % thresholds. As we need powerful and reliable taxonomical tools, and as potential new tools such as pan-genomics have not yet been fully evaluated for taxonomic purposes, we propose to use as thresholds, genus by genus, the minimum and maximum similarity values observed among species.
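
The threshold check and the proposed genus-specific bounds described above can be sketched in a few lines. The input layout and the function name are hypothetical, and the similarity values are assumed to be precomputed percentages for species pairs within one genus. The upper bound is treated as "greater than or equal to 98.7 %", following the wording of the results sentence.

```python
GENUS_THRESHOLD = 95.0    # percent similarity, from the abstract
SPECIES_THRESHOLD = 98.7  # percent similarity, from the abstract


def summarise_genus(pair_similarities):
    """Summarise pairwise 16S similarities within a single genus.

    pair_similarities maps (species_a, species_b) -> percent similarity.
    Returns the pairs that fall outside the recommended thresholds and
    the observed minimum and maximum, which the abstract proposes to
    use as genus-specific thresholds.
    """
    if not pair_similarities:
        raise ValueError("at least one species pair is required")

    values = list(pair_similarities.values())
    too_low = sorted(p for p, s in pair_similarities.items()
                     if s < GENUS_THRESHOLD)
    too_high = sorted(p for p, s in pair_similarities.items()
                      if s >= SPECIES_THRESHOLD)
    return {
        "proposed_min": min(values),
        "proposed_max": max(values),
        "below_genus_threshold": too_low,
        "at_or_above_species_threshold": too_high,
        "respects_thresholds": not too_low and not too_high,
    }
```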


2015 ◽  
Vol 5 (1) ◽  
Author(s):  
Kirsten A. Ziesemer ◽  
Allison E. Mann ◽  
Krithivasan Sankaranarayanan ◽  
Hannes Schroeder ◽  
Andrew T. Ozga ◽  
...  

Abstract To date, characterization of ancient oral (dental calculus) and gut (coprolite) microbiota has been primarily accomplished through a metataxonomic approach involving targeted amplification of one or more variable regions in the 16S rRNA gene. Specifically, the V3 region (E. coli 341–534) of this gene has been suggested as an excellent candidate for ancient DNA amplification and microbial community reconstruction. However, in practice this metataxonomic approach often produces highly skewed taxonomic frequency data. In this study, we use non-targeted (shotgun metagenomics) sequencing methods to better understand skewed microbial profiles observed in four ancient dental calculus specimens previously analyzed by amplicon sequencing. Through comparisons of microbial taxonomic counts from paired amplicon (V3 U341F/534R) and shotgun sequencing datasets, we demonstrate that extensive length polymorphisms in the V3 region are a consistent and major cause of differential amplification leading to taxonomic bias in ancient microbiome reconstructions based on amplicon sequencing. We conclude that systematic amplification bias confounds attempts to accurately reconstruct microbiome taxonomic profiles from 16S rRNA V3 amplicon data generated using universal primers. Because in silico analysis indicates that alternative 16S rRNA hypervariable regions will present similar challenges, we advocate for the use of a shotgun metagenomics approach in ancient microbiome reconstructions.


2003 ◽  
Vol 49 (1) ◽  
pp. 1-8 ◽  
Author(s):  
Achim Schmalenberger ◽  
Christoph C Tebbe

In this field study, we compared the bacterial communities inhabiting the rhizosphere of a transgenic, herbicide-resistant sugar beet (Beta vulgaris) cultivar with those of its nonengineered counterpart, using a genetic profiling technique based on PCR amplifications of partial 16S rRNA gene sequences and single-strand conformation polymorphism (SSCP). As a control for the plasticity of the bacterial community, we also analyzed the influence of herbicides, the field heterogeneity, and the annual variation. DNA was isolated from bacterial cell consortia that were directly collected from root material. PCR was carried out with primers that hybridized to evolutionarily conserved regions flanking variable regions 4 and 5 of the 16S rRNA gene. SSCP patterns of these PCR products were composed of approximately 50 distinguishable bands, as detected by silver staining of the gels after electrophoresis. Patterns of the replicates and the different treatments were highly similar, but digital image and similarity analyses revealed differences that corresponded to the positions of the replicates in the field. In addition, communities collected from sugar beet in two successive growing seasons could be distinguished. In contrast, no effect of the transgenic herbicide resistance was detectable. Sequencing of 24 dominant products of the SSCP profiles indicated the presence of bacteria from different phylogenetic groups, with Proteobacteria and members of the Cytophaga–Flavobacterium–Bacteroides group being most abundant. Key words: genetic profiles, rRNA genes, transgenic sugar beet, risk assessment, rhizosphere, PCR–SSCP, microbial community analysis, glufosinate, phosphinothricin.


2017 ◽  
Author(s):  
Garold Fuks ◽  
Michael Elgart ◽  
Amnon Amir ◽  
Amit Zeisel ◽  
Peter J. Turnbaugh ◽  
...  

Abstract Background. Most of our knowledge about the remarkable microbial diversity on Earth comes from sequencing the 16S rRNA gene. The use of next-generation sequencing methods has increased sample number and sequencing depth, but the read length of the most widely used sequencing platforms today is quite short, requiring the researcher to choose a subset of the gene to sequence (typically 16-33% of the total length). Thus, many bacteria may share the same amplified region and the resolution of profiling is inherently limited. Platforms that offer ultra long read lengths, whole genome shotgun sequencing approaches, and computational frameworks formerly suggested by us and by others, all allow different ways to circumvent this problem yet suffer various shortcomings. There is a need for a simple and low-cost 16S rRNA gene based profiling approach that harnesses the short read length to provide a much larger coverage of the gene to allow for high resolution, even in harsh conditions of low bacterial biomass and fragmented DNA. Results. This manuscript suggests Short MUltiple Regions Framework (SMURF), a method to combine sequencing results from different PCR-amplified regions to provide one coherent profiling. The de facto amplicon length is the total length of all amplified regions, thus providing much higher resolution compared to current techniques. Computationally, the method solves a convex optimization problem that allows extremely fast reconstruction and requires only moderate memory. We demonstrate the increase in resolution by in silico simulations and by profiling two mock mixtures and real-world biological samples. Reanalyzing a mock mixture from the Human Microbiome Project achieved about two-fold improvement in resolution when combining two independent regions. 
Using a custom set of six primer pairs spanning about 1200 bp (80%) of the 16S rRNA gene, we were able to achieve ~100-fold improvement in resolution compared to a single region, over a mock mixture of common human gut bacterial isolates. Finally, profiling of a Drosophila melanogaster microbiome using the set of six primer pairs provided a ~100-fold increase in resolution, thus enabling efficient downstream analysis. Conclusions. SMURF enables identification of near full-length 16S rRNA gene sequences in microbial communities, with resolution superior to current techniques. It may be applied to standard sample preparation protocols with very few modifications. SMURF also paves the way to high-resolution profiling of low-biomass and fragmented DNA, e.g., in the case of Formalin-fixed and Paraffin-embedded samples, fossil-derived DNA or DNA exposed to other degrading conditions. The approach is not restricted to combining amplicons of the 16S rRNA gene and may be applied to any set of amplicons, e.g., in Multilocus Sequence Typing (MLST).
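
The idea of combining several short regions into one profile can be illustrated with a small sketch. It is not the SMURF implementation: it swaps in a plain expectation-maximization loop on an error-free read model, in which every read exactly matches the reference sequence of its source taxon in that region. Maximizing the resulting mixture likelihood is a convex problem, but the read model, the function name, and the input layout are all assumptions made for illustration.

```python
def estimate_frequencies(reference, region_reads, iterations=200):
    """Estimate taxon frequencies from read counts in several regions.

    reference:    taxon -> {region: sequence amplified in that region}
    region_reads: region -> {observed sequence: read count}
    Reads whose sequence matches no reference taxon are ignored.
    """
    taxa = sorted(reference)
    if not taxa:
        raise ValueError("reference is empty")

    # Each observation is the group of taxa compatible with a read group.
    observations = []
    for region, counts in region_reads.items():
        for seq, n in counts.items():
            matching = [t for t in taxa if reference[t].get(region) == seq]
            if matching and n > 0:
                observations.append((matching, n))
    if not observations:
        raise ValueError("no read matches the reference")

    freq = {t: 1.0 / len(taxa) for t in taxa}  # uniform starting point
    for _ in range(iterations):
        expected = dict.fromkeys(taxa, 0.0)
        for matching, n in observations:
            denom = sum(freq[t] for t in matching)
            if denom == 0.0:
                continue
            # E-step: share reads among compatible taxa by current frequency.
            for t in matching:
                expected[t] += n * freq[t] / denom
        total = sum(expected.values())
        # M-step: renormalise expected read counts into frequencies.
        freq = {t: v / total for t, v in expected.items()}
    return freq
```

In a toy case where taxa A and B are identical in one region and B and C are identical in another, neither region alone separates all three, but the combined reads do, which is the gain in resolution the abstract describes.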


2014 ◽  
Author(s):  
Catherine Burke ◽  
Aaron E Darling

We describe a method for sequencing full-length 16S rRNA gene amplicons using the high throughput Illumina MiSeq platform. The resulting sequences have about 100-fold higher accuracy than standard Illumina reads and are chimera filtered using information from a single molecule dual tagging scheme that boosts the signal available for chimera detection. We demonstrate that the data provides fine scale phylogenetic resolution not available from Illumina amplicon methods targeting smaller variable regions of the 16S rRNA gene.


2015 ◽  
Author(s):  
Hans Verstraelen ◽  
Ramiro Vilchez-Vargas ◽  
Fabian Desimpel ◽  
Ruy Jauregui ◽  
Nele Vankeirsbilck ◽  
...  

Background. It is widely assumed that the uterine cavity in non-pregnant women is a sterile body environment under physiological conditions. We have previously shown that some women with overt dysbiosis of the vaginal microbiome present with a polymicrobial Gardnerella vaginalis-dominated biofilm covering the endometrium, casting doubt on the paradigm of the sterility of the human uterus. We therefore aimed to assess the putative presence of a uterine microbiome in a series of non-pregnant women through deep sequencing of the V1-2 hypervariable region of the 16S ribosomal RNA (rRNA) gene. Methods. We sampled the endometrial surface by use of a transcervical device designed to avoid contamination from the vagina and endocervix in nineteen non-pregnant women with reproductive failure in the absence of uterine anomalies on hysteroscopy. Following DNA extraction, the V1-2 region of the 16S rRNA gene was targeted using the 27F and 338R primers. By use of the Illumina MiSeq platform, 16S rRNA gene amplicon sequences were identified and annotated by use of the Ribosomal Database Project. Results. Out of 183 unique 16S rRNA gene amplicon sequences, 15 operational taxonomic units or phylotypes were present in all samples, possibly representing the uterine core microbiome, dominated by Bacteroides xylanivorans, Bacteroides thetaiotaomicron, Bacteroides fragilis, and Pelomonas. Accordingly, three bacterial phyla, Proteobacteria, Firmicutes and Bacteroidetes, were consistently present. In some women, the endometrial community was also characterized by a single abundant species co-occurring with the core microbiota, in particular Lactobacillus crispatus, Lactobacillus iners, and Prevotella amnii, while in two women the community was largely different. Discussion. Our findings are, albeit not necessarily generalizable, consistent with the presence of a unique microbiome residing on the endometrium of the human non-pregnant uterus in women of reproductive age. 
A majority of women showed a rather similar endometrial community, dominated by only a few Bacteroides and Pelomonas phylotypes. Consistent with our current understanding of the human microbiome, the uterine microbiome is likely to have a previously unrecognized role in uterine physiology and human reproduction. Further study is therefore warranted to document community ecology and dynamics of the uterine microbiota, as well as the role of the uterine microbiome in health and disease.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Abeer Babiker Idris ◽  
Hadeel Gassim Hassan ◽  
Maryam Atif Salaheldin Ali ◽  
Sulafa Mohamed Eltaher ◽  
Leena Babiker Idris ◽  
...  

Background. Helicobacter pylori (H. pylori) is ubiquitous among humans and one of the best-studied examples of an intimate association between bacteria and humans. The phylogeny and phylogeography of H. pylori strains are known to mirror human migration patterns and reflect significant demographic events in human prehistory. In this study, we analyzed the molecular evolution of H. pylori strains detected from different tribes and regions of Sudan using the 16S rRNA gene and a phylogenetic approach. Materials and methods. A total of 75 gastric biopsies were taken from patients who had been referred for endoscopy from different regions of Sudan. DNA extraction was performed using the guanidine chloride method. Two sets of primers (universal and specific for H. pylori) were used to amplify the 16S ribosomal gene. Sanger sequencing was applied, and the resulting sequences were matched with the sequences of the National Center for Biotechnology Information (NCBI) nucleotide database. The evolutionary aspects were analyzed using MEGA7 software. Results. Molecular detection of H. pylori showed that 28 (37.33%) of the patients were positive for H. pylori, and no significant differences were found in sociodemographic characteristics, endoscopy series, and H. pylori infection. Nucleotide variations were observed at five nucleotide positions (positions 219, 305, 578, 741, and 763–764), and one insertion mutation (750_InsC_751) was present in sixty-seven percent (7/12) of our strains. These six mutations were detected in regions of the 16S rRNA not closely associated with either tetracycline or tRNA binding sites; 66.67% of them were located in the central domain of 16S rRNA. The phylogenetic analysis of 16S rRNA sequences identified two lineages of H. pylori strains detected from different regions in Sudan. The presence of Sudanese H. pylori strains resembling Hungarian H. pylori strains could reflect the migration of Hungarian people to Sudan or vice versa. Conclusion. 
This finding emphasizes the significance of studying the phylogeny of H. pylori strains as a discriminatory tool to mirror human migration patterns. In addition, the 16S rRNA gene amplification method was found useful for bacterial identification and phylogeny.

