Consistent and correctable bias in metagenomic sequencing experiments

eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Michael R McLaren ◽  
Amy D Willis ◽  
Benjamin J Callahan

Marker-gene and metagenomic sequencing have profoundly expanded our ability to measure biological communities. But the measurements they provide differ from the truth, often dramatically, because these experiments are biased toward detecting some taxa over others. This experimental bias makes the taxon or gene abundances measured by different protocols quantitatively incomparable and can lead to spurious biological conclusions. We propose a mathematical model for how bias distorts community measurements based on the properties of real experiments. We validate this model with 16S rRNA gene and shotgun metagenomics data from defined bacterial communities. Our model better fits the experimental data despite being simpler than previous models. We illustrate how our model can be used to evaluate protocols, to understand the effect of bias on downstream statistical analyses, and to measure and correct bias given suitable calibration controls. These results illuminate new avenues toward truly quantitative and reproducible metagenomics measurements.

2019 ◽  
Author(s):  
Michael R. McLaren ◽  
Amy D. Willis ◽  
Benjamin J. Callahan

Abstract Measurements of biological communities by marker-gene and metagenomic sequencing are biased: The measured relative abundances of taxa or their genes are systematically distorted from their true values because each step in the experimental workflow preferentially detects some taxa over others. Bias can lead to qualitatively incorrect conclusions and makes measurements from different protocols quantitatively incomparable. A rigorous understanding of bias is therefore essential. Here we propose, test, and apply a simple mathematical model of how bias distorts marker-gene and metagenomics measurements: Bias multiplies the true relative abundances within each sample by taxon- and protocol-specific factors that describe the different efficiencies with which taxa are detected by the workflow. Critically, these factors are consistent across samples with different compositions, allowing bias to be estimated and corrected. We validate this model in 16S rRNA gene and shotgun metagenomics data from bacterial communities with defined compositions. We use it to reason about the effects of bias on downstream statistical analyses, finding that analyses based on taxon ratios are less sensitive to bias than analyses based on taxon proportions. Finally, we demonstrate how this model can be used to quantify bias from samples of defined composition, to partition bias into steps such as DNA extraction and PCR amplification, and to correct biased measurements. Our model improves on previous models by providing a better fit to experimental data and by providing a composition-independent approach to analyzing, measuring, and correcting bias.
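The multiplicative model described in this abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation; it is a minimal Python example assuming three hypothetical taxa, made-up efficiency values, and a single control sample of known composition. The function names (`apply_bias`, `estimate_efficiency`, `correct_bias`) are illustrative only. Because proportions are renormalized within each sample, efficiencies are only identifiable up to a constant factor, so the sketch scales them to a geometric mean of 1.

```python
import numpy as np

def apply_bias(actual, efficiency):
    """Multiply actual proportions by per-taxon efficiencies, then renormalize."""
    weighted = np.asarray(actual, dtype=float) * np.asarray(efficiency, dtype=float)
    return weighted / weighted.sum()

def estimate_efficiency(observed, actual):
    """Estimate relative efficiencies from one control of known composition.

    Efficiencies are identifiable only up to a constant factor, so they are
    scaled here to have a geometric mean of 1 (an arbitrary convention).
    """
    ratio = np.asarray(observed, dtype=float) / np.asarray(actual, dtype=float)
    return ratio / np.exp(np.mean(np.log(ratio)))

def correct_bias(observed, efficiency):
    """Divide observed proportions by the efficiencies, then renormalize."""
    weighted = np.asarray(observed, dtype=float) / np.asarray(efficiency, dtype=float)
    return weighted / weighted.sum()

# Hypothetical values chosen for illustration (not taken from the paper).
efficiency = np.array([1.0, 4.0, 0.25])      # assumed per-taxon efficiencies
control_actual = np.array([1/3, 1/3, 1/3])   # even mock community
sample_actual = np.array([0.7, 0.2, 0.1])    # sample with a different composition

control_observed = apply_bias(control_actual, efficiency)
sample_observed = apply_bias(sample_actual, efficiency)

# Estimate bias from the control, then use it to correct the other sample.
estimated = estimate_efficiency(control_observed, control_actual)
sample_corrected = correct_bias(sample_observed, estimated)

# Fold error in the ratio of taxon 0 to taxon 1: the same in both samples.
ratio_error_control = (control_observed[0] / control_observed[1]) / (
    control_actual[0] / control_actual[1])
ratio_error_sample = (sample_observed[0] / sample_observed[1]) / (
    sample_actual[0] / sample_actual[1])

# Fold error in the proportion of taxon 0: depends on sample composition.
prop_error_control = control_observed[0] / control_actual[0]
prop_error_sample = sample_observed[0] / sample_actual[0]

print("corrected sample:", sample_corrected)
print("ratio errors:", ratio_error_control, ratio_error_sample)
print("proportion errors:", prop_error_control, prop_error_sample)
```

Under these assumptions, the error in a taxon ratio equals the ratio of the two efficiencies regardless of what else is in the sample, while the error in a single taxon's proportion changes with the sample's composition. This is one way to see why ratio-based analyses are described as less sensitive to bias than proportion-based ones.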


mSystems ◽  
2020 ◽  
Vol 5 (4) ◽  
Author(s):  
Ganesh Babu Malli Mohan ◽  
Ceth W. Parker ◽  
Camilla Urbaniak ◽  
Nitin K. Singh ◽  
Anthony Hood ◽  
...  

ABSTRACT Microbial contamination during long-term confinements of space exploration presents potential risks for both crew members and spacecraft life support systems. A novel swab kit was used to sample various surfaces from a submerged, closed, analog habitat to characterize the microbial populations. Samples were collected from various locations across the habitat, which were constructed from various surface materials (linoleum, dry wall, particle board, glass, and metal), and microbial populations were examined by culture, quantitative PCR (qPCR), microbiome 16S rRNA gene sequencing, and shotgun metagenomics. Propidium monoazide (PMA)-treated samples identified the viable/intact microbial population of the habitat. The cultivable microbial population ranged from below the detection limit to 10^6 CFU/sample, and their identity was characterized using Sanger sequencing. Both 16S rRNA amplicon and shotgun sequencing were used to characterize the microbial dynamics, community profiles, and functional attributes (metabolism, virulence, and antimicrobial resistance). The 16S rRNA amplicon sequencing revealed an abundance of viable (after PMA treatment) Actinobacteria (Brevibacterium, Nesterenkonia, Mycobacterium, Pseudonocardia, and Corynebacterium), Firmicutes (Virgibacillus, Staphylococcus, and Oceanobacillus), and Proteobacteria (especially Acinetobacter) on linoleum, dry wall, and particle board (LDP) surfaces, while members of Firmicutes (Leuconostocaceae) and Proteobacteria (Enterobacteriaceae) were high on the glass/metal surfaces. Nonmetric multidimensional scaling determined from both 16S rRNA and metagenomic analyses revealed differential microbial species on LDP surfaces and glass/metal surfaces. The shotgun metagenomic sequencing of samples after PMA treatment showed bacterial predominance of viable Brevibacterium (53.6%), Brachybacterium (7.8%), Pseudonocardia (9.9%), Mycobacterium (3.7%), and Staphylococcus (2.1%), while fungal analyses revealed Aspergillus and Penicillium dominance. IMPORTANCE This study provides the first assessment of monitoring cultivable and viable microorganisms on surfaces within a submerged, closed, analog habitat. The results of the analyses presented herein suggest that the surface material plays a role in microbial community structure, as the microbial populations differed between LDP and metal/glass surfaces. The metal/glass surfaces had a less complex community, lower bioburden, and more closely resembled the controls. These results indicated that material choice is crucial when building closed habitats, even if they are simply analogs. Finally, while a few species were associated with previously cultivated isolates from the International Space Station and MIR spacecraft, the majority of the microbial ecology of the submerged analog habitat differs greatly from that of previously studied analog habitats.


2020 ◽  
Vol 86 (24) ◽  
Author(s):  
Tobin J. Hammer ◽  
Jacob C. Dickerson ◽  
W. Owen McMillan ◽  
Noah Fierer

ABSTRACT Lepidoptera (butterflies and moths) are diverse and ecologically important, yet we know little about how they interact with microbes as adults. Due to metamorphosis, the form and function of their adult-stage microbiomes might be very different from those of microbiomes in the larval stage (caterpillars). We studied adult-stage microbiomes of Heliconius and closely related passion-vine butterflies (Heliconiini), which are an important model system in evolutionary biology. To characterize the structure and dynamics of heliconiine microbiomes, we used field collections of wild butterflies, 16S rRNA gene sequencing, quantitative PCR, and shotgun metagenomics. We found that Heliconius butterflies harbor simple and abundant bacterial communities that are moderately consistent among conspecific individuals and over time. Heliconiine microbiomes also exhibited a strong signal of the host phylogeny, with a major distinction between Heliconius and other butterflies. These patterns were largely driven by differing relative abundances of bacterial phylotypes shared among host species and genera, as opposed to the presence or absence of host-specific phylotypes. We suggest that the phylogenetic structure in heliconiine microbiomes arises from conserved host traits that differentially filter microbes from the environment. While the relative importance of different traits remains unclear, our data indicate that pollen feeding (unique to Heliconius) is not a primary driver. Using shotgun metagenomics, we also discovered trypanosomatids and microsporidia to be prevalent in butterfly guts, raising the possibility of antagonistic interactions between eukaryotic parasites and colocalized gut bacteria. Our discovery of characteristic and phylogenetically structured microbiomes provides a foundation for tests of adult-stage microbiome function, a poorly understood aspect of lepidopteran biology. IMPORTANCE Many insects host microbiomes with important ecological functions. However, the prevalence of this phenomenon is unclear because in many insect taxa, microbiomes have been studied in only part of the life cycle, if at all. A prominent example is butterflies and moths, in which the composition and functional role of adult-stage microbiomes are largely unknown. We comprehensively characterized microbiomes in adult passion-vine butterflies. Butterfly-associated bacterial communities are generally abundant in guts, consistent within populations, and composed of taxa widely shared among hosts. More closely related butterflies harbor more similar microbiomes, with the most dramatic shift in microbiome composition occurring in tandem with a suite of ecological and life history traits unique to the genus Heliconius. Butterflies are also frequently infected with previously undescribed eukaryotic parasites, which may interact with bacteria in important ways. These findings advance our understanding of butterfly biology and insect-microbe interactions generally.


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 331
Author(s):  
Nachon Raethong ◽  
Massalin Nakphaichit ◽  
Narissara Suratannon ◽  
Witida Sathitkowitchai ◽  
Wanlapa Weerapakorn ◽  
...  

The gut microbiome plays a major role in the maintenance of human health. Characterizing the taxonomy and metabolic functions of the human gut microbiome is necessary for enhancing health. Here, we analyzed the metagenomic sequencing, assembly and construction of a meta-gene catalogue of the human gut microbiome with the overall aim of investigating the taxonomy and metabolic functions of the gut microbiome in Thai adults. As a result, the integrative analysis of 16S rRNA gene and whole metagenome shotgun (WMGS) sequencing data revealed that the dominant gut bacterial families were Lachnospiraceae and Ruminococcaceae of the Firmicutes phylum. Consistently, across 3.8 million (M) genes annotated from 163.5 gigabases (Gb) of WMGS sequencing data, a significant number of genes associated with carbohydrate metabolism of the dominant bacterial families were identified. Further identification of bacterial community-wide metabolic functions promisingly highlighted the importance of Roseburia and Faecalibacterium involvement in central carbon metabolism, sugar utilization and metabolism towards butyrate biosynthesis. This work presents an initial study of shotgun metagenomics in a Thai population-based cohort in a developing Southeast Asian country.


2017 ◽  
Author(s):  
Hannah Holland-Moritz ◽  
Julia Stuart ◽  
Lily R. Lewis ◽  
Samantha Miller ◽  
Michelle C. Mack ◽  
...  

Abstract Mosses are critical components of boreal ecosystems where they typically account for a large proportion of net primary productivity and harbor diverse bacterial communities that can be the major source of biologically-fixed nitrogen in these ecosystems. Despite their ecological importance, we have limited understanding of how microbial communities vary across boreal moss species and the extent to which local environmental conditions may influence the composition of these bacterial communities. We used marker gene sequencing to analyze bacterial communities associated with eight boreal moss species collected near Fairbanks, AK, USA. We found that host identity was more important than site in determining bacterial community composition and that mosses harbor diverse lineages of potential N2-fixers as well as an abundance of novel taxa assigned to understudied bacterial phyla (including candidate phylum WPS-2). We performed shotgun metagenomic sequencing to assemble genomes from the WPS-2 candidate phylum and found that these moss-associated bacteria are likely anoxygenic phototrophs capable of carbon fixation via RuBisCo with an ability to utilize byproducts of photorespiration from hosts via a glyoxylate shunt. These results give new insights into the metabolic capabilities of understudied bacterial lineages that associate with mosses and the importance of plant hosts in shaping their microbiomes.


2020 ◽  
Author(s):  
Megan Sarah Beaudry ◽  
Jincheng Wang ◽  
Troy Kieran ◽  
Jesse Thomas ◽  
Natalia Juliana Bayona-Vasquez ◽  
...  

Environmental microbial diversity is often investigated from a molecular perspective using 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics. While amplicon methods are fast, low-cost, and have curated reference databases, they can suffer from amplification bias and are limited in genomic scope. In contrast, shotgun metagenomic methods sample more genomic regions with fewer sequence acquisition biases. However, shotgun metagenomic sequencing is much more expensive (even with moderate sequencing depth) and computationally challenging. Here, we develop a set of 16S rRNA sequence capture baits that offer a potential middle ground with the advantages from both approaches for investigating microbial communities. These baits cover the diversity of all 16S rRNA sequences available in the Greengenes (v. 13.5) database, with no sequence having < 80% sequence similarity to at least one bait for all segments of 16S. The use of our baits provides comparable results to 16S amplicon libraries and shotgun metagenomic libraries when assigning taxonomic units from 16S sequences within the metagenomic reads. We demonstrate that 16S rRNA capture baits can be used on a range of microbial samples (i.e., mock communities and rodent fecal samples) to increase the proportion of 16S rRNA sequences (average >400-fold) and decrease analysis time to obtain consistent community assessments. Furthermore, our study reveals that bioinformatic methods used to analyze sequencing data may have a greater influence on estimates of community composition than the library preparation method used, likely due in part to the extent and curation of the reference databases considered.


2020 ◽  
Author(s):  
Po-E Li ◽  
Joseph A. Russell ◽  
David Yarmosh ◽  
Alan G. Shteyman ◽  
Kyle Parker ◽  
...  

ABSTRACT Metagenomics is emerging as an important tool in biosurveillance, public health, and clinical applications. However, ease-of-use for execution and data analysis remains a barrier-of-entry to the adoption of metagenomics in applied health and forensics settings. In addition, these venues often have more stringent requirements for reporting, accuracy, and precision than the traditional ecological research role of the technology. Here, we present PanGIA (Pan-Genomics for Infectious Agents), a novel bioinformatics analysis platform for hosting, processing, analyzing, and reporting shotgun metagenomics data of complex samples suspected of containing one or more pathogens. PanGIA was developed to address gaps that often preclude clinicians, medical technicians, forensics personnel, or other non-expert end-users from the routine application of metagenomics for pathogen identification. Though primarily designed to detect pathogenic microorganisms within clinical and environmental metagenomics data, PanGIA also serves as an analytical framework for microbial community profiling and comparative metagenomics. To provide statistical confidence in PanGIA’s taxonomic assignments, the system provides two independent estimations of probability for species and strain level detection. First, PanGIA integrates coverage data with ‘uniqueness’ information mapped across each reference genome for a stand-alone determination of confidence for each query sequence at each taxonomy level. Second, if a negative-control sample is provided, PanGIA compares this sample with a corresponding experimental unknown sample and determines a measure of confidence associated with ‘detection above background’. An integrated graphical user interface allows interactive interrogation and enables users to summarize multiple sample results by confidence score, normalized read abundance, reference genome linear coverage, depth-of-coverage, RPKM, and other metrics to detect specific organisms-of-interest. Comparison testing of the PanGIA algorithm against a number of recent k-mer, read-mapping, and marker-gene based taxonomy classifiers across various real-world datasets with spiked targets shows superior mean positive predictive value, sensitivity, and specificity. PanGIA can process a five million paired-end read dataset in under 1 hour on commodity computational hardware. The source code and documentation are publicly available at https://github.com/LANL-Bioinformatics/PanGIA or https://github.com/mriglobal/PanGIA. The database for PanGIA can be downloaded from ftp://bioinformatics.mriglobal.org/. The full GUI-based PanGIA analysis environment is available in a Docker container and can be installed from https://hub.docker.com/r/poeli/pangia/.


2019 ◽  
Author(s):  
Ezequiel Santillan ◽  
Hari Seshan ◽  
Florentin Constancias ◽  
Stefan Wuertz

Summary Trait-based approaches are increasingly gaining importance in community ecology, as a way of finding general rules for the mechanisms driving changes in community structure and function under the influence of perturbations. Frameworks for life-history strategies have been successfully applied to describe changes in plant and animal communities upon disturbance. To evaluate their applicability to complex bacterial communities, we operated replicated wastewater treatment bioreactors for 35 days and subjected them to eight different disturbance frequencies of a toxic pollutant (3-chloroaniline), starting with a mixed inoculum from a full-scale treatment plant. Relevant ecosystem functions were tracked and microbial communities assessed through metagenomics and 16S rRNA gene sequencing. Combining a series of ordination, statistical and network analysis methods, we associated different life-history strategies with microbial communities across the disturbance range. These strategies were evaluated using tradeoffs in community function and genotypic potential, and changes in bacterial genus composition. We further compared our findings with other ecological studies and adopted a semi-quantitative CSR (competitors, ruderals, stress-tolerants) classification. The framework reduces complex datasets of microbial traits, functions, and taxa into ecologically meaningful components to help understand the system response to disturbance, and hence represents a promising tool for managing microbial communities. Originality-Significance Statement This study establishes, for the first time, CSR life-history strategies in the context of bacterial communities. This framework is explained using community aggregated traits in an environment other than soil, also a first, using a combination of ordination methods, network analysis, and genotypic information from shotgun metagenomics and 16S rRNA gene amplicon sequencing.


2020 ◽  
Vol 48 (16) ◽  
pp. e93-e93
Author(s):  
Anna Tovo ◽  
Peter Menzel ◽  
Anders Krogh ◽  
Marco Cosentino Lagomarsino ◽  
Samir Suweis

Abstract Characterizing species diversity and composition of bacteria hosted by biota is revolutionizing our understanding of the role of symbiotic interactions in ecosystems. Determining microbiome diversity implies the assignment of individual reads to taxa by comparison to reference databases. Although computational methods aimed at identifying the microbial taxa are available, it is well known that inferences using different methods can vary widely depending on various biases. In this study, we first apply and compare different bioinformatics methods based on 16S ribosomal RNA gene and shotgun sequencing to three mock communities of bacteria, of which the compositions are known. We show that none of these methods can infer both the true number of taxa and their abundances. We thus propose a novel approach, named Core-Kaiju, which combines the power of shotgun metagenomics data with a more focused marker gene classification method similar to 16S, but based on emergent statistics of core protein domain families. We then test the proposed method on various mock communities and show that Core-Kaiju reliably predicts both the number of taxa and their abundances. Finally, we apply our method to human gut samples, showing how Core-Kaiju may give a more accurate ecological characterization and a fresh view on real microbiomes.


2022 ◽  
Vol 12 ◽  
Author(s):  
Antonia Cristi ◽  
Génesis Parada-Pozo ◽  
Felipe Morales-Vicencio ◽  
César A. Cárdenas ◽  
Nicole Trefault

Sponge-associated microorganisms are essential for sponge survival. They play an important role in recycling nutrients and, therefore, in the maintenance of the ecosystem. These microorganisms are diverse, species-specific, and different from those in the surrounding seawater. Bacterial sponge symbionts have been extensively studied in the tropics; however, little is known about these microorganisms in sponges from high-latitude environments. Sponges can cover up to 80% of the benthos in Antarctica and are crucial architects for the marine food web. In this study, we present analyses of the bacterial symbionts of three sponges: Haliclona (Rhizoniera) sp., Hymeniacidon torquata, and Isodictya kerguelenensis from the Western Antarctic Peninsula (WAP) with the aim of determining variations in the specificity of the bacteria–sponge interactions and potential signatures in their predicted functional profiles. We use high-throughput 16S rRNA gene sequencing of 30 sponge individuals inhabiting South Bay (Palmer Archipelago, WAP) to describe their microbiome taxonomy and diversity and predict potential functional profiles based on this marker gene. Our work shows similar bacterial community composition profiles among the same sponge species, although the symbiotic relationship is not equally conserved among the three Antarctic sponges. The number of species-specific core operational taxonomic units (OTUs) of these Antarctic sponges was low, with important differences between the total abundance accounted for by these OTUs. Only eight OTUs were shared between the three sponge species. Analyses of the functional potential revealed that despite the high host–symbiont specificity, the inferred functions are conserved among these microbiomes, although with differences in the abundance of specific functions. H. torquata showed the highest level of intra-specificity and a higher potential of pathways related to energy metabolism, metabolisms of terpenoids and polyketides, and biosynthesis of other secondary metabolites. Overall, this work shows variations in the specificity of the sponge-associated bacterial communities, differences in how hosts and symbionts establish their relations, and in their potential functional capabilities.

