High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution

Abstract Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed Escherichia coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use.
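The "integral ratios" statement in this abstract can be illustrated with a small calculation. The sketch below is not the authors' code: the function name, the read counts, and the assumed total of 7 rRNA gene copies in the genome are all hypothetical. It rescales the read counts of variants from one genome so that they sum to the assumed copy number, then reports how far each value lies from the nearest integer.

```python
def estimate_copy_numbers(variant_counts, total_copies):
    """Rescale read counts to sum to total_copies and round to whole copies.

    Returns the rounded copy number per variant and the largest distance
    between a rescaled value and its nearest integer.
    """
    total_reads = sum(variant_counts.values())
    scaled = {
        name: count * total_copies / total_reads
        for name, count in variant_counts.items()
    }
    copies = {name: round(value) for name, value in scaled.items()}
    max_deviation = max(abs(scaled[name] - copies[name]) for name in scaled)
    return copies, max_deviation


# Hypothetical read counts for three 16S variants assigned to one genome
# that is assumed (for illustration only) to carry 7 rRNA gene copies.
counts = {"variant_A": 4050, "variant_B": 1980, "variant_C": 1010}
copies, deviation = estimate_copy_numbers(counts, total_copies=7)
print(copies, round(deviation, 3))
```

A small maximum deviation is what one would expect if the variants are genuine alleles present in whole-number copies; a large deviation would point to errors or to a wrong copy-number assumption.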

Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax)

mBio ◽ 10.1128/mbio.01557-20 ◽ 2020 ◽ Vol 11 (5) ◽ Cited By ~ 2

Author(s): Morten Simonsen Dueholm ◽ Kasper Skytte Andersen ◽ Simon Jon McIlroy ◽ Jannie Munk Kristensen ◽ Erika Yashiro ◽ ...

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Amplicon Sequencing ◽ Species Level ◽ Full Length ◽ rRNA Gene ◽ High Identity ◽ Reference Databases ◽ Reference Sequences

ABSTRACT High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference databases and by the absence of a systematic and comprehensive taxonomy for the uncultured majority. Here, we demonstrate how high-throughput synthetic long-read sequencing can be applied to create ecosystem-specific full-length 16S rRNA gene amplicon sequence variant (FL-ASV) resolved reference databases that include high-identity references (>98.7% identity) for nearly all abundant bacteria (>0.01% relative abundance) using Danish wastewater treatment systems and anaerobic digesters as an example. In addition, we introduce a novel sequence identity-based approach for automated taxonomy assignment (AutoTax) that provides a complete seven-rank taxonomy for all reference sequences, using the SILVA taxonomy as a backbone, with stable placeholder names for unclassified taxa. The FL-ASVs are perfectly suited for the evaluation of taxonomic resolution and bias associated with primers commonly used for amplicon sequencing, allowing researchers to choose those that are ideal for their ecosystem. Reference databases processed with AutoTax greatly improve the classification of short-read 16S rRNA ASVs at the genus- and species-level, compared with the commonly used universal reference databases. Importantly, the placeholder names provide a way to explore the unclassified environmental taxa at different taxonomic ranks, which in combination with in situ analyses can be used to uncover their ecological roles.
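The idea of an identity-based, seven-rank assignment with placeholder names can be sketched as follows. This is a simplified illustration and not the AutoTax implementation: the function, the placeholder format `denovo_<rank letter>_<cluster id>`, the example lineage, and all cutoffs other than the 98.7% species value quoted in the abstract are assumptions made for this example.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]

# Percent-identity cutoffs per rank. Only the species value (98.7) appears in
# the abstract; the other values are assumed here for illustration.
THRESHOLDS = {
    "domain": 0.0,
    "phylum": 75.0,
    "class": 78.5,
    "order": 82.0,
    "family": 86.5,
    "genus": 94.5,
    "species": 98.7,
}


def assign_taxonomy(reference_lineage, identity, cluster_id):
    """Keep reference names down to the deepest rank supported by identity.

    Ranks that the identity does not support, or that the reference leaves
    empty, receive a placeholder name built from a hypothetical cluster id.
    """
    lineage = []
    for rank, name in zip(RANKS, reference_lineage):
        if name and identity >= THRESHOLDS[rank]:
            lineage.append(name)
        else:
            lineage.append(f"denovo_{rank[0]}_{cluster_id}")
    return lineage


# Hypothetical best hit: 96.2% identity supports the genus but not the species.
best_hit = [
    "Bacteria", "Proteobacteria", "Gammaproteobacteria",
    "Burkholderiales", "Rhodocyclaceae", "Zoogloea", "Zoogloea caeni",
]
result = assign_taxonomy(best_hit, identity=96.2, cluster_id=17)
print(result)
```

Because the placeholder depends only on the cluster id, every sequence in the same cluster would receive the same name, which is one simple way to keep placeholder names stable.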

Generation of comprehensive ecosystems-specific reference databases with species-level resolution by high-throughput full-length 16S rRNA gene sequencing and automated taxonomy assignment (AutoTax)

10.1101/672873 ◽ 2019 ◽ Cited By ~ 11

Author(s): Morten Simonsen Dueholm ◽ Kasper Skytte Andersen ◽ Simon Jon McIlroy ◽ Jannie Munk Kristensen ◽ Erika Yashiro ◽ ...

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Amplicon Sequencing ◽ Species Level ◽ Full Length ◽ rRNA Gene ◽ High Identity ◽ Reference Databases ◽ Reference Sequences

Abstract High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference databases, and by the absence of a systematic and comprehensive taxonomy for the uncultured majority. Here we demonstrate how high-throughput synthetic long-read sequencing can be applied to create ecosystem-specific full-length 16S rRNA gene amplicon sequence variant (FL-ASV) reference databases that include high-identity references (>98.7% identity) for nearly all abundant bacteria (>0.01% relative abundance) using Danish wastewater treatment systems and anaerobic digesters as an example. In addition, we introduce a novel sequence identity-based approach for automated taxonomy assignment (AutoTax) that provides a complete seven-rank taxonomy for all reference sequences, using the SILVA taxonomy as a backbone, with stable placeholder names for unclassified taxa. The FL-ASVs are perfectly suited for the evaluation of taxonomic resolution and bias associated with primers commonly used for amplicon sequencing, allowing researchers to choose those that are ideal for their ecosystem. The AutoTax taxonomy greatly improves the classification of short-read 16S rRNA gene amplicon sequence variants (ASVs) at the genus- and species-level, compared to the commonly used universal reference databases. Importantly, the placeholder names provide a way to explore the unclassified environmental taxa at different taxonomic ranks, which in combination with in situ analyses can be used to uncover their ecological roles.

A method for high precision sequencing of near full-length 16S rRNA genes on an Illumina MiSeq

PeerJ ◽ 10.7717/peerj.2492 ◽ 2016 ◽ Vol 4 ◽ pp. e2492 ◽ Cited By ~ 29

Author(s): Catherine M. Burke ◽ Aaron E. Darling

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Single Molecule ◽ Illumina MiSeq ◽ Full Length ◽ 16S rRNA Genes ◽ rRNA Genes ◽ rRNA Gene ◽ Bacterial Taxonomy

Background The bacterial 16S rRNA gene has historically been used in defining bacterial taxonomy and phylogeny. However, there are currently no high-throughput methods to sequence full-length 16S rRNA genes present in a sample with precision. Results We describe a method for sequencing near full-length 16S rRNA gene amplicons using the high throughput Illumina MiSeq platform and test it using DNA from human skin swab samples. Proof of principle of the approach is demonstrated, with the generation of 1,604 sequences greater than 1,300 nt from a single Nano MiSeq run, with accuracy estimated to be 100-fold higher than standard Illumina reads. The reads were chimera filtered using information from a single molecule dual tagging scheme that boosts the signal available for chimera detection. Conclusions This method could be scaled up to generate many thousands of sequences per MiSeq run and could be applied to other sequencing platforms. This has great potential for populating databases with high quality, near full-length 16S rRNA gene sequences from under-represented taxa and environments and facilitates analyses of microbial communities at higher resolution.

Superior resolution characterisation of microbial diversity in anaerobic digesters using full-length 16S rRNA gene amplicon sequencing

Water Research ◽ 10.1016/j.watres.2020.115815 ◽ 2020 ◽ Vol 178 ◽ pp. 115815 ◽ Cited By ~ 2

Author(s): Theo Y.C. Lam ◽ Ran Mei ◽ Zhuoying Wu ◽ Patrick K.H. Lee ◽ Wen-Tso Liu ◽ ...

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ Microbial Diversity ◽ Amplicon Sequencing ◽ Full Length ◽ rRNA Gene ◽ Anaerobic Digesters

Establishment and Assessment of An Amplicon Sequencing Method Targeting The 16S-ITS-23S rRNA Operon For Analysis of The Equine Gut Microbiome

10.21203/rs.3.rs-156589/v1 ◽ 2021

Author(s): Yuta Kinoshita ◽ Hidekazu NIWA ◽ Eri UCHIDA-FUJII ◽ Toshio NUKADA

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ Amplicon Sequencing ◽ Full Length ◽ rRNA Operon ◽ rRNA Genes ◽ Taxonomic Resolution ◽ 23S rRNA ◽ rRNA Gene ◽ Fecal Samples

Abstract Microbial communities are commonly studied by using amplicon sequencing of part of the 16S rRNA gene. Sequencing of the full-length 16S rRNA gene can provide higher taxonomic resolution and accuracy. To obtain even higher taxonomic resolution, with as few false-positives as possible, we assessed a method using long amplicon sequencing targeting the rRNA operon combined with a CCMetagen pipeline. Taxonomic assignment had >90% accuracy at the species level in a mock sample and at the family level in equine fecal samples, generating similar taxonomic composition as shotgun sequencing. The rRNA operon amplicon sequencing of equine fecal samples underestimated compositional percentages of bacterial strains containing unlinked rRNA genes by a third to almost a half, but unlinked rRNA genes had a limited effect on the overall results. The rRNA operon amplicon sequencing with the A519F + U2428R primer set was able to reflect archaeal genomes, whereas full-length 16S rRNA with 27F + 1492R could not. Therefore, we conclude that amplicon sequencing targeting the rRNA operon captures more detailed variations of bacterial and archaeal microbiota.

Exploring the roles of and interactions among microbes in dry co-digestion of food waste and pig manure using high-throughput 16S rRNA gene amplicon sequencing

Biotechnology for Biofuels ◽ 10.1186/s13068-018-1344-0 ◽ 2019 ◽ Vol 12 (1) ◽ Cited By ~ 10

Author(s): Yan Jiang ◽ Conor Dennehy ◽ Peadar G. Lawlor ◽ Zhenhu Hu ◽ Matthew McCabe ◽ ...

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Food Waste ◽ Amplicon Sequencing ◽ Pig Manure ◽ rRNA Gene

Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing

Nucleic Acids Research ◽ 10.1093/nar/gkw984 ◽ 2016 ◽ pp. gkw984 ◽ Cited By ~ 23

Author(s): Dieter M. Tourlousse ◽ Satowa Yoshiike ◽ Akiko Ohashi ◽ Satoko Matsukura ◽ Naohiro Noda ◽ ...

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Amplicon Sequencing ◽ rRNA Gene

rpoB, a promising marker for analyzing the diversity of bacterial communities by amplicon sequencing

10.21203/rs.2.9507/v2 ◽ 2019

Author(s): Jean-Claude OGIER ◽ Sylvie Pagès ◽ Maxime Galan ◽ Matthieu Barret ◽ Sophie Gaudriault

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Amplicon Sequencing ◽ Taxonomic Resolution ◽ Taxonomic Structure ◽ rRNA Gene ◽ Specificity And Sensitivity ◽ The 16S rRNA Gene ◽ Mock Communities

Abstract Background Microbiome composition is frequently studied by the amplification and high-throughput sequencing of specific molecular markers (metabarcoding). Various hypervariable regions of the 16S rRNA gene are classically used to estimate bacterial diversity, but other universal bacterial markers with a finer taxonomic resolution could be employed. We compared specificity and sensitivity between a portion of the rpoB gene and the V3V4 hypervariable region of the 16S rRNA gene. Results We first designed universal primers for rpoB suitable for use with Illumina sequencing-based technology and constructed a reference rpoB database of 45,000 sequences. The rpoB and V3V4 markers were amplified and sequenced from (i) a mock community of 19 bacterial strains from both Gram-negative and Gram-positive lineages; (ii) bacterial assemblages associated with entomopathogenic nematodes. In metabarcoding analyses of mock communities with two analytical pipelines (FROGS and DADA2), the estimated diversity captured with the rpoB marker resembled the expected composition of these mock communities more closely than that captured with V3V4. The rpoB marker had a higher level of taxonomic affiliation, a higher sensitivity (detection of all the species present in the mock communities), and a higher specificity (low rates of spurious OTU detection) than V3V4. We applied both primers to infective juveniles of the nematode Steinernema glaseri. Both markers showed the bacterial community associated with this nematode to be of low diversity (< 50 OTUs), but only rpoB reliably detected the symbiotic bacterium Xenorhabdus poinarii. Conclusions Our results confirm that different microbiota composition data may be obtained with different markers. We found that rpoB was a highly appropriate marker for assessing the taxonomic structure of mock communities and the nematode microbiota. Further studies on other ecosystems should be considered to evaluate the universal usefulness of the rpoB marker. Our data highlight two crucial elements that should be taken into account to ensure more reliable and accurate descriptions of microbial diversity in high-throughput amplicon sequencing analyses: i) the need to include mock communities as controls; ii) the advantages of using a multigenic approach including at least one housekeeping gene (rpoB is a good candidate) and one variable region of the 16S rRNA gene.
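The abstract describes sensitivity as detection of the species present in a mock community and specificity in terms of a low rate of spurious OTUs. A minimal sketch of how such figures could be computed from a mock-community control is given below. The function name and the taxon labels are hypothetical, and this is not the scoring procedure used by the authors.

```python
def mock_community_metrics(expected, detected):
    """Compare detected taxa against the known mock-community composition.

    sensitivity: fraction of expected taxa that were detected.
    spurious_rate: fraction of detected taxa that are not in the mock community.
    """
    expected, detected = set(expected), set(detected)
    sensitivity = len(expected & detected) / len(expected)
    spurious_rate = len(detected - expected) / len(detected) if detected else 0.0
    return sensitivity, spurious_rate


# Hypothetical mock community of four taxa; one is missed and one extra
# taxon is reported.
expected_taxa = ["taxon_1", "taxon_2", "taxon_3", "taxon_4"]
detected_taxa = ["taxon_1", "taxon_2", "taxon_3", "taxon_9"]
sensitivity, spurious_rate = mock_community_metrics(expected_taxa, detected_taxa)
print(sensitivity, spurious_rate)
```

Running the same comparison for each marker on the same mock community gives a like-for-like basis for choosing between them.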

rpoB, a promising marker for analyzing the diversity of bacterial communities by amplicon sequencing

10.21203/rs.2.9507/v1 ◽ 2019

Author(s): Jean-Claude OGIER ◽ Sylvie Pagès ◽ Maxime Galan ◽ Matthieu Barret ◽ Sophie Gaudriault

Keyword(s): 16S rRNA ◽ 16S rRNA Gene ◽ High Throughput ◽ Amplicon Sequencing ◽ Taxonomic Resolution ◽ Taxonomic Structure ◽ rRNA Gene ◽ Specificity And Sensitivity ◽ The 16S rRNA Gene ◽ Mock Communities

