Beyond Taxonomic Identification: Integration of Ecological Responses to a Soil Bacterial 16S rRNA Gene Database

2021 ◽  
Vol 12 ◽  
Author(s):  
Briony Jones ◽  
Tim Goodall ◽  
Paul B. L. George ◽  
Hyun S. Gweon ◽  
Jeremy Puissant ◽  
...  

High-throughput sequencing 16S rRNA gene surveys have enabled new insights into the diversity of soil bacteria, and furthered understanding of the ecological drivers of abundances across landscapes. However, current analytical approaches are of limited use in formalizing syntheses of the ecological attributes of taxa discovered, because derived taxonomic units are typically unique to individual studies and sequence identification databases only characterize taxonomy. To address this, we used sequences obtained from a large nationwide soil survey (GB Countryside Survey, henceforth CS) to create a comprehensive soil-specific 16S reference database, with coupled ecological information derived from survey metadata. Specifically, we modeled taxon responses to soil pH at the OTU level using hierarchical logistic regression (HOF) models, to provide information on both the shape of landscape-scale pH-abundance responses and pH optima (the pH at which OTU abundance is maximal). We found that most of the soil OTUs examined exhibited a non-flat relationship with soil pH. Further, the pH optima could not be generalized by broad taxonomy, highlighting the need for tools and databases synthesizing ecological traits at finer taxonomic resolution. We further demonstrate the utility of the database by testing it against geographically dispersed query 16S datasets, evaluating efficacy by quantifying matches and accuracy in predicting pH responses of query sequences from a separate large soil survey. We found that the CS database provided good coverage of dominant taxa, and that the taxa indicating soil pH in a query dataset corresponded with the pH classifications of top matches in the CS database. Furthermore, we were able to predict query dataset community structure, using predicted abundances of dominant taxa based on query soil pH data and the HOF models of matched CS database taxa.
The database with associated HOF model outputs is released as an online portal for querying single sequences of interest (https://shiny-apps.ceh.ac.uk/ID-TaxER/), and flat files are made available for use in bioinformatic pipelines. The further development of advanced informatics infrastructures incorporating modeled ecological attributes along with new functional genomic information will likely facilitate large scale exploration and prediction of soil microbial functional biodiversity under current and future environmental change scenarios.
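As a rough illustration of what a HOF response shape and a pH optimum look like, the sketch below assumes the standard Huisman-Olff-Fresco family of five nested response curves (flat, monotone, plateau, symmetric unimodal, skewed unimodal); the abstract itself does not give the equations. The function names, the parameter names `a`, `b`, `c`, `d`, `M`, and the pH range 3.5 to 8.5 are hypothetical choices for this example, not values from the CS database. The optimum is located by a simple grid search and no model fitting is attempted.

```python
import math


def hof_response(x, model, a=0.0, b=0.0, c=0.0, d=0.0, M=1.0):
    """Expected abundance at gradient position x (scaled to 0..1).

    Assumed Huisman-Olff-Fresco forms; M is the maximum attainable value.
    """
    if model == "I":    # flat: no response to the gradient
        return M / (1 + math.exp(a))
    if model == "II":   # monotone increase or decrease
        return M / (1 + math.exp(a + b * x))
    if model == "III":  # monotone with a plateau below M
        return M / ((1 + math.exp(a + b * x)) * (1 + math.exp(c)))
    if model == "IV":   # symmetric unimodal
        return M / ((1 + math.exp(a + b * x)) * (1 + math.exp(c - b * x)))
    if model == "V":    # skewed unimodal
        return M / ((1 + math.exp(a + b * x)) * (1 + math.exp(c - d * x)))
    raise ValueError(f"unknown HOF model type: {model!r}")


def ph_optimum(model, params, ph_min=3.5, ph_max=8.5, steps=501):
    """Grid-search the pH at which the modelled abundance is largest.

    ph_min and ph_max are hypothetical survey limits used to scale pH to 0..1.
    For a flat curve every pH ties, so the first grid point is returned.
    """
    best_ph, best_y = ph_min, float("-inf")
    for i in range(steps):
        ph = ph_min + (ph_max - ph_min) * i / (steps - 1)
        x = (ph - ph_min) / (ph_max - ph_min)
        y = hof_response(x, model, **params)
        if y > best_y:
            best_ph, best_y = ph, y
    return best_ph


if __name__ == "__main__":
    # Hypothetical parameters for a symmetric unimodal OTU response.
    print(ph_optimum("IV", {"a": -6.0, "b": 12.0, "c": 6.0}))
```

Predicting the abundance of a matched taxon at a query soil's pH would, under the same assumptions, amount to calling `hof_response` with that taxon's stored model type and parameters.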

2019 ◽  
Author(s):  
Briony A. Jones ◽  
Tim Goodall ◽  
Paul George ◽  
Hyun Soon Gweon ◽  
Jeremy Puissant ◽  
...  

Abstract High-throughput sequencing 16S rRNA gene surveys have enabled new insights into the diversity of soil bacteria, and furthered understanding of the ecological drivers of abundances across landscapes. However, current analytical approaches are of limited use in formalising syntheses of the ecological attributes of taxa discovered, because derived taxonomic units are typically unique to individual studies and sequence identification databases only characterise taxonomy. To address this, we used sequences obtained from a large nationwide soil survey (GB Countryside Survey, henceforth CS) to create a comprehensive soil-specific 16S reference database, with coupled ecological information derived from the survey metadata. Specifically, we modelled taxon responses to soil pH at the OTU level using hierarchical logistic regression (HOF) models, to provide information on putative landscape-scale pH-abundance responses. We found that most of the soil OTUs examined exhibit predictable abundance responses across soil pH gradients, though with the exception of known acidophilic lineages, the pH optima of OTU relative abundance were variable and could not be generalised by broad taxonomy. This highlights the need for tools and databases to predict ecological traits at finer taxonomic resolution. We further demonstrate the utility of the database by testing it against geographically dispersed query 16S datasets, evaluating efficacy by quantifying matches and accuracy in predicting pH responses of query sequences from a separate large soil survey. We found that the CS database provided good coverage of dominant taxa, and that the taxa indicating soil pH in a query dataset corresponded with the pH classifications of top matches in the CS database. Furthermore, we were able to predict query dataset community structure, using predicted abundances of dominant taxa based on query soil pH data and the HOF models of matched CS database taxa.
The database with associated HOF model outputs is released as an online portal for querying single sequences of interest (https://shiny-apps.ceh.ac.uk/ID-TaxER/), and flat files are made available for use in bioinformatic pipelines. The further development of advanced informatics infrastructures incorporating modelled ecological attributes along with new functional genomic information will likely facilitate large scale exploration and prediction of soil microbial functional biodiversity under current and future environmental change scenarios.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 441-442
Author(s):  
Adrian Maynez-Perez ◽  
Francisco Jahuey-Martinez ◽  
Jose A Martinez-Quintana ◽  
Michael E Hume ◽  
Robin C Anderson ◽  
...  

Abstract Raramuri Criollo cattle from the Chihuahuan desert in northern Mexico have been described as an ecological ecotype due to their enormous advantage in land grass utilization and their capacity to diversify their diet with cacti, forbs and woody plants. This diversification in diet utilization could be reflected in their microbiome composition. The aim of this study was to characterize the rumen microbiome of Raramuri Criollo cattle and to compare it to other lineages that graze in the same area. A total of 28 cows representing three lineages [Criollo (n = 13), European (n = 9) and Criollo x European Crossbred (n = 6)] were grazed without supplementation for 45 days. DNA was extracted from ruminal samples and the V4 region of the 16S rRNA gene was sequenced on an Illumina platform. Data were analyzed with the QIIME2 software package and DADA2 plugin, and the amplicon sequence variants were taxonomically classified with a naïve Bayesian classifier using the SILVA 16S rRNA gene reference database (version 132). Statistical analysis was performed by ANOVA and PERMANOVA for alpha and beta diversity indexes, respectively, and the non-strict version of linear discriminant analysis effect size (LEfSe) was used to determine significantly different taxa among lineages. Differences in beta diversity indexes (P < 0.05) were found in ruminal microbiome composition between Criollo and European groups, whereas the Crossbred showed intermediate values when compared to the pure breeds (Table 1). LEfSe analysis identified a total of 20 bacterial groups that explained differences between lineages, including one for Crossbred, ten for European and nine for Criollo. These results show ruminal microbiome differences between Raramuri Criollo cattle and the mainstream European breeds used in the northern Mexico Chihuahuan desert, and suggest that those differences could be a consequence of dissimilar grazing behavior.


2015 ◽  
Vol 15 (6) ◽  
pp. 1435-1445 ◽  
Author(s):  
Johan Decelle ◽  
Sarah Romac ◽  
Rowena F. Stern ◽  
El Mahdi Bendif ◽  
Adriana Zingone ◽  
...  

2019 ◽  
Author(s):  
Till Robin Lesker ◽  
Abilash Chakravarthy ◽  
Eric J.C. Gálvez ◽  
Ilias Lagkouvardos ◽  
John F. Baines ◽  
...  

Abstract The vast complexity of host-associated microbial ecosystems requires generation of host-specific gene catalogs to survey the functions and diversity of these communities. We generated a comprehensive resource, the integrated mouse gut metagenome catalog (iMGMC), comprising 4.6 million unique genes and 660 high-quality metagenome-assembled genomes (MAGs) linked to reconstructed full-length 16S rRNA gene sequences. iMGMC enables unprecedented coverage and taxonomic resolution, i.e. more than 89% of the identified taxa are not represented in any other databases. The tool (github.com/tillrobin/iMGMC) allowed characterizing the diversity and functions of prevalent and previously unknown microbial community members along the gastrointestinal tract. Moreover, we show that integration of MAGs and 16S rRNA gene data allows a more accurate prediction of functional profiles of communities than prediction based on 16S rRNA amplicons alone. Integrated gene catalogs such as iMGMC are needed to enhance the resolution of numerous existing and future sequencing-based studies.


2021 ◽  
Author(s):  
Yuta Kinoshita ◽  
Hidekazu Niwa ◽  
Eri Uchida-Fujii ◽  
Toshio Nukada

Abstract Microbial communities are commonly studied by using amplicon sequencing of part of the 16S rRNA gene. Sequencing of the full-length 16S rRNA gene can provide higher taxonomic resolution and accuracy. To obtain even higher taxonomic resolution, with as few false positives as possible, we assessed a method using long amplicon sequencing targeting the rRNA operon combined with a CCMetagen pipeline. Taxonomic assignment had >90% accuracy at the species level in a mock sample and at the family level in equine fecal samples, generating a taxonomic composition similar to that obtained by shotgun sequencing. The rRNA operon amplicon sequencing of equine fecal samples underestimated compositional percentages of bacterial strains containing unlinked rRNA genes by a third to almost a half, but unlinked rRNA genes had a limited effect on the overall results. The rRNA operon amplicon sequencing with the A519F + U2428R primer set was able to reflect archaeal genomes, whereas full-length 16S rRNA sequencing with 27F + 1492R could not. Therefore, we conclude that amplicon sequencing targeting the rRNA operon captures more detailed variations of bacterial and archaeal microbiota.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Marco Meola ◽  
Etienne Rifa ◽  
Noam Shani ◽  
Céline Delbès ◽  
Hélène Berthoud ◽  
...  

mSphere ◽  
2018 ◽  
Vol 3 (5) ◽  
Author(s):  
Robin R. Rohwer ◽  
Joshua J. Hamilton ◽  
Ryan J. Newton ◽  
Katherine D. McMahon

ABSTRACT Taxonomy assignment of freshwater microbial communities is limited by the minimally curated phylogenies used for large taxonomy databases. Here we introduce TaxAss, a taxonomy assignment workflow that classifies 16S rRNA gene amplicon data using two taxonomy reference databases: a large comprehensive database and a small ecosystem-specific database rigorously curated by scientists within a field. We applied TaxAss to five different freshwater data sets using the comprehensive SILVA database and the freshwater-specific FreshTrain database. TaxAss increased the percentage of the data set classified compared to using only SILVA, especially at fine-resolution family to species taxon levels, while across the freshwater test data sets classifications increased by as much as 11 to 40% of total reads. A similar increase in classifications was not observed in a control mouse gut data set, which was not expected to contain freshwater bacteria. TaxAss also maintained taxonomic richness compared to using only the FreshTrain across all taxon levels from phylum to species. Without TaxAss, most organisms not represented in the FreshTrain were unclassified, but at fine taxon levels, incorrect classifications became significant. We validated TaxAss using simulated amplicon data derived from full-length clone libraries and found that 96 to 99% of test sequences were correctly classified at fine resolution. TaxAss splits a data set’s sequences into two groups based on their percent identity to reference sequences in the ecosystem-specific database. Sequences with high similarity to sequences in the ecosystem-specific database are classified using that database, and the others are classified using the comprehensive database. TaxAss is free and open source and is available at https://www.github.com/McMahonLab/TaxAss. 
IMPORTANCE Microbial communities drive ecosystem processes, but microbial community composition analyses using 16S rRNA gene amplicon data sets are limited by the lack of fine-resolution taxonomy classifications. Coarse taxonomic groupings at the phylum, class, and order levels lump ecologically distinct organisms together. To avoid this, many researchers define operational taxonomic units (OTUs) based on clustered sequences, sequence variants, or unique sequences. These fine-resolution groupings are more ecologically relevant, but OTU definitions are data set dependent and cannot be compared between data sets. Microbial ecologists studying freshwater have curated a small, ecosystem-specific taxonomy database to provide consistent and up-to-date terminology. We created TaxAss, a workflow that leverages this database to assign taxonomy. We found that TaxAss improves fine-resolution taxonomic classifications (family, genus, and species). Fine taxonomic groupings are more ecologically relevant, so they provide an alternative to OTU-based analyses that is consistent and comparable between data sets.
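The splitting step described in the abstract above (sequences are routed to one of two databases according to their percent identity to the ecosystem-specific references) can be sketched as follows. This is a minimal illustration and not the TaxAss implementation: the function name `split_by_identity`, the input dictionary of precomputed best-hit identities, and the 98% cutoff are all assumptions made for this example.

```python
def split_by_identity(best_hit_identity, cutoff=98.0):
    """Split sequence IDs into two groups by percent identity.

    best_hit_identity maps a sequence ID to the percent identity of its best
    match in the ecosystem-specific database (None when no match was found).
    The cutoff value is a hypothetical choice for this sketch.
    """
    ecosystem_ids = []      # to be classified with the ecosystem-specific database
    comprehensive_ids = []  # to be classified with the comprehensive database
    for seq_id, pident in best_hit_identity.items():
        if pident is not None and pident >= cutoff:
            ecosystem_ids.append(seq_id)
        else:
            comprehensive_ids.append(seq_id)
    return ecosystem_ids, comprehensive_ids


if __name__ == "__main__":
    # Hypothetical best-hit identities for four query sequences.
    hits = {"seq1": 99.2, "seq2": 85.0, "seq3": None, "seq4": 98.0}
    eco, comp = split_by_identity(hits)
    print("ecosystem-specific:", eco)
    print("comprehensive:", comp)
```

Each group would then be passed to a taxonomy classifier trained on the corresponding database, and the two sets of assignments merged afterwards.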


2014 ◽  
Vol 16 (8) ◽  
pp. 2389-2407 ◽  
Author(s):  
Stefan Pfeiffer ◽  
Milica Pastar ◽  
Birgit Mitter ◽  
Kathrin Lippert ◽  
Evelyn Hackl ◽  
...  

2019 ◽  
Author(s):  
Jean-Claude Ogier ◽  
Sylvie Pagès ◽  
Maxime Galan ◽  
Matthieu Barret ◽  
Sophie Gaudriault

Abstract
Background Microbiome composition is frequently studied by the amplification and high-throughput sequencing of specific molecular markers (metabarcoding). Various hypervariable regions of the 16S rRNA gene are classically used to estimate bacterial diversity, but other universal bacterial markers with a finer taxonomic resolution could be employed. We compared specificity and sensitivity between a portion of the rpoB gene and the V3V4 hypervariable region of the 16S rRNA gene.
Results We first designed universal primers for rpoB suitable for use with Illumina sequencing-based technology and constructed a reference rpoB database of 45,000 sequences. The rpoB and V3V4 markers were amplified and sequenced from (i) a mock community of 19 bacterial strains from both Gram-negative and Gram-positive lineages and (ii) bacterial assemblages associated with entomopathogenic nematodes. In metabarcoding analyses of mock communities with two analytical pipelines (FROGS and DADA2), the estimated diversity captured with the rpoB marker resembled the expected composition of these mock communities more closely than that captured with V3V4. The rpoB marker had a higher level of taxonomic affiliation, a higher sensitivity (detection of all the species present in the mock communities), and a higher specificity (low rates of spurious OTU detection) than V3V4. We applied both primers to infective juveniles of the nematode Steinernema glaseri. Both markers showed the bacterial community associated with this nematode to be of low diversity (< 50 OTUs), but only rpoB reliably detected the symbiotic bacterium Xenorhabdus poinarii.
Conclusions Our results confirm that different microbiota composition data may be obtained with different markers. We found that rpoB was a highly appropriate marker for assessing the taxonomic structure of mock communities and the nematode microbiota. Further studies on other ecosystems should be considered to evaluate the universal usefulness of the rpoB marker. Our data highlight two crucial elements that should be taken into account to ensure more reliable and accurate descriptions of microbial diversity in high-throughput amplicon sequencing analyses: (i) the need to include mock communities as controls; (ii) the advantages of using a multigenic approach including at least one housekeeping gene (rpoB is a good candidate) and one variable region of the 16S rRNA gene.

