Species-level bacterial community profiling of the healthy sinonasal microbiome using Pacific Biosciences sequencing of full-length 16S rRNA genes

Abstract Background Pan-bacterial 16S rRNA microbiome surveys performed with massively parallel DNA sequencing technologies have transformed community microbiological studies. Current 16S profiling methods, however, fail to provide sufficient taxonomic resolution and accuracy to adequately perform species-level associative studies for specific conditions. This is due to the amplification and sequencing of only short 16S rRNA gene regions, typically providing for only family- or genus-level taxonomy. Moreover, sequencing errors often inflate the number of taxa present. Pacific Biosciences’ (PacBio’s) long-read technology in particular suffers from high error rates per base. Herein we present a microbiome analysis pipeline that takes advantage of PacBio circular consensus sequencing (CCS) technology to sequence and error correct full-length bacterial 16S rRNA genes, which provides high-fidelity species-level microbiome data. Results Analysis of a mock community with 20 bacterial species demonstrated 100% specificity and sensitivity. Examination of a 250-plus species mock community demonstrated correct species-level classification of >90% of taxa, and relative abundances were accurately captured. The majority of the remaining taxa were demonstrated to be multiply, incorrectly, or incompletely classified. Using this methodology, we examined the microgeographic variation present among the microbiomes of six sinonasal sites, by both swab and biopsy, from the anterior nasal cavity to the sphenoid sinus from 12 subjects undergoing trans-sphenoidal hypophysectomy. We found greater variation among subjects than among sites within a subject, although significant within-individual differences were also observed. Propionibacterium acnes (recently renamed Cutibacterium acnes [1]) was the predominant species throughout, but was found at distinct relative abundances by site. Conclusions Our microbial composition analysis pipeline for single-molecule real-time 16S rRNA gene sequencing (MCSMRT, https://github.com/jpearl01/mcsmrt) overcomes deficits of standard marker gene based microbiome analyses by using CCS of entire 16S rRNA genes to provide increased taxonomic and phylogenetic resolution. Extensions of this approach to other marker genes could help refine taxonomic assignments of microbial species and improve reference databases, as well as strengthen the specificity of associations between microbial communities and dysbiotic states.
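The mock-community figures above (specificity and sensitivity at the species level) can be illustrated with a minimal sketch. It assumes the usual set-based definitions, with true negatives counted over a pool of candidate species that the classifier could have reported; the abstract does not state how the authors computed these values, and the function name and species labels below are hypothetical.

```python
def detection_metrics(expected, observed, candidates):
    """Species-level sensitivity and specificity for a mock community.

    expected   -- species known to be in the mock community
    observed   -- species reported by the analysis pipeline
    candidates -- every species the classifier could have reported
                  (assumed here; needed to count true negatives)
    """
    expected, observed, candidates = set(expected), set(observed), set(candidates)
    tp = len(expected & observed)               # present and reported
    fn = len(expected - observed)               # present but missed
    fp = len(observed - expected)               # reported but absent
    tn = len(candidates - expected - observed)  # absent and not reported
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels, not data from the study.
mock = {"species_A", "species_B"}
reported = {"species_A", "species_C"}
pool = {"species_A", "species_B", "species_C", "species_D"}
print(detection_metrics(mock, reported, pool))
```

A run in which every expected species is reported and nothing else is reported gives 1.0 for both values, which corresponds to the 100% figures quoted for the 20-species community.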


A method for high precision sequencing of near full-length 16S rRNA genes on an Illumina MiSeq

PeerJ ◽ 10.7717/peerj.2492 ◽ 2016 ◽ Vol 4 ◽ pp. e2492 ◽ Cited By ~ 29

Author(s): Catherine M. Burke ◽ Aaron E. Darling

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Single Molecule ◽ Illumina MiSeq ◽ Full Length ◽ 16S rRNA Genes ◽ rRNA Genes ◽ rRNA Gene ◽ Bacterial Taxonomy

Background The bacterial 16S rRNA gene has historically been used in defining bacterial taxonomy and phylogeny. However, there are currently no high-throughput methods to sequence full-length 16S rRNA genes present in a sample with precision. Results We describe a method for sequencing near full-length 16S rRNA gene amplicons using the high throughput Illumina MiSeq platform and test it using DNA from human skin swab samples. Proof of principle of the approach is demonstrated, with the generation of 1,604 sequences greater than 1,300 nt from a single Nano MiSeq run, with accuracy estimated to be 100-fold higher than standard Illumina reads. The reads were chimera filtered using information from a single molecule dual tagging scheme that boosts the signal available for chimera detection. Conclusions This method could be scaled up to generate many thousands of sequences per MiSeq run and could be applied to other sequencing platforms. This has great potential for populating databases with high quality, near full-length 16S rRNA gene sequences from under-represented taxa and environments and facilitates analyses of microbial communities at higher resolution.


Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

10.7287/peerj.preprints.778v2 ◽ 2016 ◽ Cited By ~ 2

Author(s): Patrick D Schloss ◽ Matthew L Jenior ◽ Charles C. Koumpouras ◽ Sarah L Westcott ◽ Sarah K Highlander

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ DNA Sequencing ◽ Error Rate ◽ Full Length ◽ rRNA Genes ◽ rRNA Gene ◽ Mock Community ◽ Sequencing Platforms ◽ The 16S rRNA Gene

Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality, but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.
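As an illustration of how an observed error rate can be measured against a mock community, the following minimal sketch counts disagreeing columns between reads and their known references. It assumes the reads have already been pairwise aligned to the references, with "-" marking gaps; the function name and toy sequences are hypothetical, and this is not the curation method described in the abstract.

```python
def error_rate_percent(aligned_pairs):
    """Percent of alignment columns where a read disagrees with its reference.

    aligned_pairs -- iterable of (read, reference) strings of equal length,
                     with '-' marking gaps (substitutions and indels both count).
    """
    errors = 0
    columns = 0
    for read, reference in aligned_pairs:
        if len(read) != len(reference):
            raise ValueError("aligned sequences must have equal length")
        for read_base, ref_base in zip(read, reference):
            columns += 1
            if read_base != ref_base:
                errors += 1
    if columns == 0:
        raise ValueError("no aligned columns supplied")
    return 100.0 * errors / columns


# Toy alignments: the second read carries one substitution and one deletion.
pairs = [
    ("ACGTACGT", "ACGTACGT"),
    ("ACGTAC-T", "ACGAACGT"),
]
print(error_rate_percent(pairs))

# Fold reduction implied by the reported V1-V9 rates (0.69% to 0.027%).
print(0.69 / 0.027)
```

The last line works out to roughly a 25-fold reduction in the observed error rate.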


Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

10.7287/peerj.preprints.778 ◽ 2016

Author(s): Patrick D Schloss ◽ Matthew L Jenior ◽ Charles C. Koumpouras ◽ Sarah L Westcott ◽ Sarah K Highlander

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ DNA Sequencing ◽ Error Rate ◽ Full Length ◽ rRNA Genes ◽ rRNA Gene ◽ Mock Community ◽ Sequencing Platforms ◽ The 16S rRNA Gene


Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

PeerJ ◽ 10.7717/peerj.1869 ◽ 2016 ◽ Vol 4 ◽ pp. e1869 ◽ Cited By ~ 125

Author(s): Patrick D. Schloss ◽ Matthew L. Jenior ◽ Charles C. Koumpouras ◽ Sarah L. Westcott ◽ Sarah K. Highlander

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ DNA Sequencing ◽ Error Rate ◽ Full Length ◽ rRNA Genes ◽ rRNA Gene ◽ Mock Community ◽ Sequencing Platforms ◽ The 16S rRNA Gene

Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.


Species-level bacterial community profiling of the healthy sinonasal microbiome using Pacific Biosciences sequencing of full-length 16S rRNA genes

Microbiome ◽ 10.1186/s40168-018-0569-2 ◽ 2018 ◽ Vol 6 (1) ◽ Cited By ~ 25

Author(s): Joshua P. Earl ◽ Nithin D. Adappa ◽ Jaroslaw Krol ◽ Archana S. Bhat ◽ Sergey Balashov ◽ ...

Keyword(s): Bacterial Community ◽ 16S rRNA ◽ Species Level ◽ Full Length ◽ 16S rRNA Genes ◽ rRNA Genes ◽ Pacific Biosciences ◽ Community Profiling


Amplicon sequence variants artificially split bacterial genomes into separate clusters

10.1101/2021.02.26.433139 ◽ 2021

Author(s): Patrick D. Schloss

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ Bacterial Genome ◽ Full Length ◽ 16S rRNA Genes ◽ rRNA Genes ◽ rRNA Gene ◽ Bacterial Genomes ◽ A Genome ◽ The 16S rRNA Gene

Abstract Amplicon sequencing variants (ASVs) have been proposed as an alternative to operational taxonomic units (OTUs) for analyzing microbial communities. ASVs have grown in popularity, in part, because of a desire to reflect a more refined level of taxonomy since they do not cluster sequences based on a distance-based threshold. However, ASVs and the use of overly narrow thresholds to identify OTUs increase the risk of splitting a single genome into separate clusters. To assess this risk, I analyzed the intragenomic variation of 16S rRNA genes from the bacterial genomes represented in a rrn copy number database, which contained 20,427 genomes from 5,972 species. As the number of copies of the 16S rRNA gene increased in a genome, the number of ASVs also increased. There was an average of 0.58 ASVs per copy of the 16S rRNA gene for full length 16S rRNA genes. It was necessary to use a distance threshold of 5.25% to cluster full length ASVs from the same genome into a single OTU with 95% confidence for genomes with 7 copies of the 16S rRNA gene, such as E. coli. This research highlights the risk of splitting a single bacterial genome into separate clusters when ASVs are used to analyze 16S rRNA gene sequence data. Although there is also a risk of clustering ASVs from different species into the same OTU when using broad distance thresholds, those risks are of less concern than artificially splitting a genome into separate ASVs and OTUs.
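The splitting risk described above can be sketched for a single genome: count the distinct 16S rRNA gene copies (each distinct copy would be its own ASV) and find the largest pairwise distance among them. The sketch assumes pre-aligned copies of equal length, a simple mismatch fraction as the distance, and a furthest-neighbor view in which one OTU requires every pair of ASVs to fall within the threshold. These are illustrative assumptions, not the method used in the study, and the sequences are toy data.

```python
from itertools import combinations


def mismatch_fraction(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def asv_summary(copies):
    """Return (number of ASVs, largest pairwise distance) for one genome.

    copies -- aligned 16S rRNA gene copies from a single genome.
    The largest distance is the smallest threshold that would place all
    copies in one OTU under the furthest-neighbor assumption.
    """
    asvs = sorted(set(copies))  # identical copies collapse into one ASV
    max_distance = max(
        (mismatch_fraction(a, b) for a, b in combinations(asvs, 2)),
        default=0.0,
    )
    return len(asvs), max_distance


# Toy genome with four copies, two of them identical.
genome_copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
n_asvs, needed_threshold = asv_summary(genome_copies)
print(n_asvs, n_asvs / len(genome_copies), needed_threshold)
```

The second printed value is the ASVs-per-copy ratio for this toy genome, the same kind of quantity as the 0.58 average reported in the abstract.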


Intragenomic Heterogeneity of 16S rRNA Genes Causes Overestimation of Prokaryotic Diversity

Applied and Environmental Microbiology ◽ 10.1128/aem.01282-13 ◽ 2013 ◽ Vol 79 (19) ◽ pp. 5962-5969 ◽ Cited By ~ 156

Author(s): Dong-Lei Sun ◽ Xuan Jiang ◽ Qinglong L. Wu ◽ Ning-Yi Zhou

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ Full Length ◽ 16S rRNA Genes ◽ rRNA Genes ◽ rRNA Gene ◽ Prokaryotic Diversity ◽ Prokaryotic Genomes ◽ General Guidance ◽ Different Levels

ABSTRACT Ever since Carl Woese introduced the use of 16S rRNA genes for determining the phylogenetic relationships of prokaryotes, this method has been regarded as the “gold standard” in both microbial phylogeny and ecology studies. However, intragenomic heterogeneity within 16S rRNA genes has been reported in many investigations and is believed to bias the estimation of prokaryotic diversity. In the current study, 2,013 completely sequenced genomes of bacteria and archaea were analyzed and intragenomic heterogeneity was found in 952 genomes (585 species), with 87.5% of the divergence detected being below the 1% level. In particular, some extremophiles (thermophiles and halophiles) were found to harbor highly divergent 16S rRNA genes. Overestimation caused by 16S rRNA gene intragenomic heterogeneity was evaluated at different levels using the full-length and partial 16S rRNA genes usually chosen as targets for pyrosequencing. The result indicates that, at the unique level, full-length 16S rRNA genes can produce an overestimation of as much as 123.7%, while at the 3% level, an overestimation of 12.9% for the V6 region may be introduced. Further analysis showed that intragenomic heterogeneity tends to concentrate in specific positions, with the V1 and V6 regions suffering the most intragenomic heterogeneity and the V4 and V5 regions suffering the least intragenomic heterogeneity in bacteria. This is the most up-to-date overview of the diversity of 16S rRNA genes within prokaryotic genomes. It not only provides general guidance on how much overestimation can be introduced when applying 16S rRNA gene-based methods, due to its intragenomic heterogeneity, but also recommends that, for bacteria, this overestimation be minimized using primers targeting the V4 and V5 regions.
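A rough sketch of how overestimation at the "unique" level might be computed is given here. It assumes overestimation is defined as the excess of distinct within-genome 16S sequences over the number of genomes, expressed as a percentage; the abstract does not give the exact formula, so this definition, the function name, and the toy data are all assumptions.

```python
def overestimation_percent(genomes):
    """Percent by which distinct 16S sequences exceed the number of genomes.

    genomes -- dict mapping a genome name to its list of 16S rRNA gene copies.
    Each genome would ideally contribute one sequence type; any additional
    distinct copy inflates the apparent diversity.
    """
    n_genomes = len(genomes)
    if n_genomes == 0:
        raise ValueError("at least one genome is required")
    n_distinct = sum(len(set(copies)) for copies in genomes.values())
    return 100.0 * (n_distinct - n_genomes) / n_genomes


# Toy data: only genome_2 carries divergent copies.
toy_genomes = {
    "genome_1": ["ACGTACGT", "ACGTACGT"],
    "genome_2": ["ACGTACGT", "ACGTACGA", "ACGTACGA"],
    "genome_3": ["TTGTACGT"],
}
print(overestimation_percent(toy_genomes))
```

Here four distinct sequences come from three genomes, so the apparent diversity is inflated by about a third. Repeating the count after clustering at a chosen distance level would give the analogous figure for that level.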


Differentiation of Clarias batrachus, C. gariepinus and Heteropneustes fossilis by PCR-Sequencing of Mitochondrial 16S rRNA Gene

Journal of the Asiatic Society of Bangladesh Science ◽ 10.3329/jasbs.v41i1.46190 ◽ 2015 ◽ Vol 41 (1) ◽ pp. 51-58

Author(s): Mohammad Shamimul Alam ◽ Hawa Jahan ◽ Rowshan Ara Begum ◽ Reza M Shahjahan

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ Fish Larva ◽ 16S rRNA Genes ◽ rRNA Genes ◽ rRNA Gene ◽ Valuable Insight ◽ Multiple Sequence ◽ PCR RFLP ◽ Alternative Means

Heteropneustes fossilis, Clarias batrachus and C. gariepinus are three major catfishes of ecological and economic importance. Identification of these fish species becomes a problem when the usual external morphological features of the fish are lost or removed, such as in canned fish. Also, newly hatched fish larvae are often difficult to identify. PCR-sequencing provides an accurate alternative means of identifying individuals at the species level. So, 16S rRNA genes of three locally collected catfishes were sequenced after PCR amplification and compared with the same gene sequences available from other geographical regions. Multiple sequence alignment of the 16S rRNA gene fragments of the catfish species revealed polymorphic sites which can be used to differentiate these three species from one another and will provide valuable insight in choosing appropriate restriction enzymes for PCR-RFLP based identification in future. Asiat. Soc. Bangladesh, Sci. 41(1): 51-58, June 2015
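Finding polymorphic sites in a multiple sequence alignment amounts to scanning for columns in which the species do not all share the same base. The minimal sketch below assumes the sequences are already aligned to equal length; the function name is hypothetical and the short sequences are invented placeholders, not the 16S rRNA data from the study.

```python
def polymorphic_sites(alignment):
    """List alignment columns where the sequences do not all agree.

    alignment -- dict mapping a sequence name to its aligned sequence.
    Returns a list of (1-based position, {name: base}) tuples.
    """
    names = list(alignment)
    if not names:
        return []
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have equal length")
    sites = []
    for index in range(length):
        column = {name: alignment[name][index] for name in names}
        if len(set(column.values())) > 1:  # more than one base in this column
            sites.append((index + 1, column))
    return sites


# Invented placeholder sequences, for illustration only.
toy_alignment = {
    "C_batrachus": "ACGTTGCA",
    "C_gariepinus": "ACGTAGCA",
    "H_fossilis": "ACCTTGCA",
}
for position, bases in polymorphic_sites(toy_alignment):
    print(position, bases)
```

Columns that separate one species from the other two are candidates for restriction sites to check when designing a PCR-RFLP assay.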


Ultra-accurate microbial amplicon sequencing with synthetic long reads

Microbiome ◽ 10.1186/s40168-021-01072-3 ◽ 2021 ◽ Vol 9 (1)

Author(s): Benjamin J. Callahan ◽ Dmitry Grinevich ◽ Siddhartha Thakur ◽ Michael A. Balamotis ◽ Tuval Ben Yehezkel

Keyword(s): Microbial Community ◽ 16S rRNA ◽ Amplicon Sequencing ◽ Species Level ◽ Full Length ◽ 16S rRNA Genes ◽ rRNA Genes ◽ Strain Identification ◽ Long Reads ◽ Long Read

Abstract Background Out of the many pathogenic bacterial species that are known, only a fraction are readily identifiable directly from a complex microbial community using standard next generation DNA sequencing. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge. Methods Here, we describe and analytically validate LoopSeq, a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads. Results LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq perfectly recovered the full diversity of 16S rRNA genes from known strains in a synthetic microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, which exceeds the accuracy reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq sequencing maintains that accuracy for reads up to 6 kb in length. LoopSeq full-length 16S rRNA reads could accurately classify organisms down to the species level in rinsate from retail meat samples, and could differentiate strains within species identified by the CDC as potential foodborne pathogens. Conclusions The order-of-magnitude improvement in length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species-level and strain identification from complex- to low-biomass microbiome samples. The ability to generate accurate and long microbiome sequencing reads using standard short read sequencers will accelerate the building of quality microbial sequence databases and removes a significant hurdle on the path to precision microbial genomics.
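Per-base error rates such as the 0.005% quoted above are often expressed on the Phred quality scale, Q = -10 log10(p), where p is the error probability. The sketch below performs that standard conversion; the function name is hypothetical, and the conversion is offered as a reading aid rather than something reported in the abstract.

```python
import math


def phred_from_error_percent(error_percent):
    """Convert a per-base error rate in percent to a Phred quality score."""
    if not 0 < error_percent <= 100:
        raise ValueError("error rate must be in the interval (0, 100]")
    error_probability = error_percent / 100.0
    return -10.0 * math.log10(error_probability)


print(round(phred_from_error_percent(0.005), 1))
```

An error rate of 0.005% corresponds to a quality score of about Q43, that is, roughly one error per 20,000 bases.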


Microbiological and Geochemical Heterogeneity in an In Situ Uranium Bioremediation Field Site

Applied and Environmental Microbiology ◽ 10.1128/aem.71.10.6308-6318.2005 ◽ 2005 ◽ Vol 71 (10) ◽ pp. 6308-6318 ◽ Cited By ~ 172

Author(s): Helen A. Vrionis ◽ Robert T. Anderson ◽ Irene Ortiz-Bernad ◽ Kathleen R. O'Neill ◽ Charles T. Resch ◽ ...

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ 16S rRNA Genes ◽ rRNA Genes ◽ Acid Volatile Sulfide ◽ rRNA Gene ◽ Gene Sequences ◽ 16S rRNA Gene Sequences ◽ Geochemical Heterogeneity

ABSTRACT The geochemistry and microbiology of a uranium-contaminated subsurface environment that had undergone two seasons of acetate addition to stimulate microbial U(VI) reduction was examined. There were distinct horizontal and vertical geochemical gradients that could be attributed in large part to the manner in which acetate was distributed in the aquifer, with more reduction of Fe(III) and sulfate occurring at greater depths and closer to the point of acetate injection. Clone libraries of 16S rRNA genes derived from sediments and groundwater indicated an enrichment of sulfate-reducing bacteria in the order Desulfobacterales in sediment and groundwater samples. These samples were collected nearest the injection gallery where microbially reducible Fe(III) oxides were highly depleted, groundwater sulfate concentrations were low, and increases in acid volatile sulfide were observed in the sediment. Further down-gradient, metal-reducing conditions were present as indicated by intermediate Fe(II)/Fe(total) ratios, lower acid volatile sulfide values, and increased abundance of 16S rRNA gene sequences belonging to the dissimilatory Fe(III)- and U(VI)-reducing family Geobacteraceae. Maximal Fe(III) and U(VI) reduction correlated with maximal recovery of Geobacteraceae 16S rRNA gene sequences in both groundwater and sediment; however, the sites at which these maxima occurred were spatially separated within the aquifer. The substantial microbial and geochemical heterogeneity at this site demonstrates that attempts should be made to deliver acetate in a more uniform manner and that closely spaced sampling intervals, horizontally and vertically, in both sediment and groundwater are necessary in order to obtain a more in-depth understanding of microbial processes and the relative contribution of attached and planktonic populations to in situ uranium bioremediation.
