Combining whole genome shotgun sequencing and rDNA amplicon analyses to improve detection of microbe-microbe interaction networks in plant leaves

Abstract. Microorganisms from all domains of life establish associations with plants. Although some harm the plant, others antagonize pathogens or prime the plant immune system, acquire nutrients, tune plant hormone levels, or perform additional services. Most culture-independent plant microbiome research has focused on amplicon sequencing of 16S rDNA and/or the internal transcribed spacer (ITS) of rDNA loci, but the decreasing cost of high-throughput sequencing has made shotgun metagenome sequencing increasingly accessible. Here, we describe shotgun sequencing of 275 wild Arabidopsis thaliana leaf microbiomes from southwest Germany, with additional bacterial 16S rDNA and eukaryotic ITS1 amplicon data from 176 of these samples. The shotgun data were dominated by bacterial sequences, with eukaryotes contributing only a minority of reads. For shotgun and amplicon data, microbial membership showed weak associations with both site of origin and plant genotype, both of which were highly confounded in this dataset. There was large variation among microbiomes, with one extreme comprising samples of low complexity and a high load of microorganisms typical of infected plants, and the other extreme being samples of high complexity and a low microbial load. We use the metagenome data, which capture the ratio of bacterial to plant DNA in leaves of wild plants, to scale the 16S rDNA amplicon data such that they reflect absolute bacterial abundance. We show that this cost-effective hybrid strategy overcomes compositionality problems in amplicon data and leads to fundamentally different conclusions about microbiome community assembly.
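
The scaling step described in this abstract can be illustrated with a minimal sketch. It assumes that "load" is simply the ratio of bacterial to plant shotgun reads and that each taxon's 16S relative abundance is multiplied by that ratio; the function name, taxon names, and read counts below are hypothetical and are not taken from the paper.

```python
def scale_to_load(amplicon_counts, bacterial_reads, plant_reads):
    """Scale 16S relative abundances by a shotgun-derived bacterial load.

    Assumption (not from the paper): load = bacterial reads / plant reads.
    """
    total = sum(amplicon_counts.values())
    if total == 0 or plant_reads == 0:
        raise ValueError("need non-zero amplicon and plant read counts")
    load = bacterial_reads / plant_reads
    # Relative abundance times load gives a load-scaled abundance per taxon.
    return {taxon: (count / total) * load
            for taxon, count in amplicon_counts.items()}


# Two invented samples: a heavily colonized leaf and a sparsely colonized one.
sample_a = scale_to_load({"Pseudomonas": 900, "Sphingomonas": 100},
                         bacterial_reads=5000, plant_reads=10000)
sample_b = scale_to_load({"Pseudomonas": 500, "Sphingomonas": 500},
                         bacterial_reads=1000, plant_reads=10000)
```

In this toy case Sphingomonas makes up 10% of one sample and 50% of the other, yet its load-scaled abundance is the same in both, which is the kind of compositional effect the hybrid strategy is meant to expose.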


Reduced metagenome sequencing for strain-resolution taxonomic profiles

Microbiome ◽

10.1186/s40168-021-01019-8 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Lars Snipen ◽

Inga-Leena Angell ◽

Torbjørn Rognes ◽

Knut Rudi

Keyword(s):

Microbial Community ◽

Microbial Community Composition ◽

Amplicon Sequencing ◽

Good Alternative ◽

Shotgun Sequencing ◽

Least Square ◽

Full Potential ◽

Shotgun Metagenomics ◽

Metagenome Sequencing ◽

Reference Genomes

Abstract. Background: Studies of shifts in microbial community composition have many applications. For studies at species or subspecies levels, 16S amplicon sequencing lacks resolution and is often replaced by full shotgun sequencing. Due to higher costs, this restricts the number of samples sequenced. As an alternative to full shotgun sequencing, we have investigated the use of Reduced Metagenome Sequencing (RMS) to estimate the composition of a microbial community. This involves the use of double-digested restriction-associated DNA sequencing, which means only a smaller fraction of the genomes is sequenced. The read sets obtained by this approach have properties different from both amplicon and shotgun data, and analysis pipelines for both either cannot be used at all or do not explore the full potential of RMS data. Results: We suggest a procedure for analyzing such data, based on fragment clustering and the use of a constrained ordinary least squares de-convolution for estimating the relative abundance of all community members. Mock community datasets show the potential to clearly separate strains even when the 16S is 100% identical and genome-wide differences are < 0.02, indicating RMS has a very high resolution. From a simulation study, we compare RMS to shotgun sequencing and show that we get improved abundance estimates when the community has many very closely related genomes. From a real dataset of infant guts, we show that RMS is capable of detecting a strain diversity gradient for Escherichia coli across time. Conclusion: We find that RMS is a good alternative to either metabarcoding or shotgun sequencing when it comes to resolving microbial communities at the strain level. Like shotgun metagenomics, it requires a good database of reference genomes and is well suited for studies of the human gut or other communities where many reference genomes exist. A data analysis pipeline is offered as an R package at https://github.com/larssnip/microRMS.
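
The idea of a constrained least squares de-convolution can be sketched on a toy problem. This is not the microRMS implementation: it assumes a small, invented fragment-by-genome copy-number matrix, uses non-negativity as the only constraint, and solves the problem with plain projected gradient descent in pure Python. All names and numbers are hypothetical.

```python
def estimate_abundances(A, y, iterations=2000):
    """Non-negative least squares by projected gradient descent.

    A[i][j]: copies of fragment cluster i in genome j (toy values).
    y[i]:    observed read signal for fragment cluster i.
    Returns abundances normalized to sum to 1.
    """
    n_frag, n_gen = len(A), len(A[0])
    # 1 / trace(A^T A) never exceeds 1 / (largest eigenvalue of A^T A),
    # so this step size keeps the descent stable.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [1.0 / n_gen] * n_gen
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(n_gen)) - y[i]
                    for i in range(n_frag)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_frag))
                for j in range(n_gen)]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_gen)]
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


# Six fragment clusters, three closely related genomes (invented).
A = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 1, 1], [0, 0, 1], [1, 0, 1]]
true_abundance = [0.5, 0.3, 0.2]
y = [sum(a * x for a, x in zip(row, true_abundance)) for row in A]
estimate = estimate_abundances(A, y)
```

Fragments shared between genomes (rows with more than one non-zero entry) are what make this a de-convolution rather than a simple count; the signal of a shared fragment has to be split among the genomes that carry it.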


Synthetic Sequencing Standards: A Guide to Database Choice for Rumen Microbiota Amplicon Sequencing Analysis

Frontiers in Microbiology ◽

10.3389/fmicb.2020.606825 ◽

2020 ◽

Vol 11 ◽

Author(s):

Paul E. Smith ◽

Sinead M. Waters ◽

Ruth Gómez Expósito ◽

Hauke Smidt ◽

Ciara A. Carberry ◽

...

Keyword(s):

High Throughput Sequencing ◽

Cost Effective ◽

Amplicon Sequencing ◽

Gas Production ◽

Reference Database ◽

Specific Reference ◽

Sequencing Analysis ◽

Sequencing Data ◽

Rumen Microbiota ◽

Reference Databases

Our understanding of complex microbial communities, such as those residing in the rumen, has drastically advanced through the use of high throughput sequencing (HTS) technologies. Indeed, with the use of barcoded amplicon sequencing, it is now cost effective and computationally feasible to identify individual rumen microbial genera associated with ruminant livestock nutrition, genetics, performance and greenhouse gas production. However, across all disciplines of microbial ecology, there is currently little reporting of the use of internal controls for validating HTS results. Furthermore, there is little consensus on the most appropriate reference database for analyzing rumen microbiota amplicon sequencing data. Therefore, in this study, a synthetic rumen-specific sequencing standard was used to assess the effects of database choice on results obtained from rumen microbial amplicon sequencing. Four DADA2 reference training sets (RDP, SILVA, GTDB, and RefSeq + RDP) were compared to assess their ability to correctly classify sequences included in the rumen-specific sequencing standard. In addition, two thresholds of phylogenetic bootstrapping, 50 and 80, were applied to investigate the effect of increasing stringency. Sequence classification differences were apparent amongst the databases. For example, the classification of Clostridium differed between all databases, thus highlighting the need for a consistent approach to nomenclature amongst different reference databases. It is hoped that the effect of database on taxonomic classification observed in this study will encourage research groups across various microbial disciplines to develop and routinely use their own microbiome-specific reference standard to validate analysis pipelines and database choice.
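
The effect of the two bootstrap thresholds can be illustrated with a small sketch. It assumes that a classifier reports one bootstrap support value per taxonomic rank and that the assignment is truncated at the first rank whose support falls below the threshold; the function name and all support values are invented for illustration and do not come from the study.

```python
def apply_bootstrap_threshold(assignment, min_boot):
    """Keep ranks from the top down until support drops below min_boot."""
    kept = []
    for rank, taxon, boot in assignment:
        if boot < min_boot:
            break  # this rank and everything below it stay unclassified
        kept.append((rank, taxon))
    return kept


# Hypothetical classification of one sequence: (rank, taxon, bootstrap).
example = [
    ("Phylum", "Firmicutes", 100),
    ("Class", "Clostridia", 98),
    ("Order", "Clostridiales", 91),
    ("Family", "Clostridiaceae", 72),
    ("Genus", "Clostridium", 55),
]
at_50 = apply_bootstrap_threshold(example, 50)
at_80 = apply_bootstrap_threshold(example, 80)
```

With these made-up values, the threshold of 50 keeps the genus-level call while the threshold of 80 stops at order level, so the stricter setting trades resolution for confidence.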


Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

PeerJ ◽

10.7717/peerj.1839 ◽

2016 ◽

Vol 4 ◽

pp. e1839 ◽

Cited By ~ 57

Author(s):

Tom O. Delmont ◽

A. Murat Eren

Keyword(s):

High Throughput Sequencing ◽

Draft Genome ◽

Cost Effective ◽

Single Copy ◽

Eukaryotic Genome ◽

Sequencing Data ◽

Bacterial Genomes ◽

Long Read ◽

Domains Of Life ◽

Genome Assemblies

High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini, and created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes from the raw assembly, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes.


Handling of targeted amplicon sequencing data focusing on index hopping and demultiplexing using a nested metabarcoding approach in ecology

Scientific Reports ◽

10.1038/s41598-021-98018-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yasemin Guenay-Greunke ◽

David A. Bohan ◽

Michael Traugott ◽

Corinna Wallinger

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Cost Effective ◽

Amplicon Sequencing ◽

Sequencing Depth ◽

Sequencing Error ◽

Sequencing Data ◽

Large Sample ◽

Sequencing Errors ◽

Plant Feeding

Abstract. High-throughput sequencing platforms are increasingly being used for targeted amplicon sequencing because they enable cost-effective sequencing of large sample sets. For meaningful interpretation of targeted amplicon sequencing data and comparison between studies, it is critical that bioinformatic analyses do not introduce artefacts and rely on detailed protocols to ensure that all methods are properly performed and documented. The analysis of large sample sets and the use of predefined indexes create challenges, such as adjusting the sequencing depth across samples and taking sequencing errors or index hopping into account. However, the potential biases these factors introduce to high-throughput amplicon sequencing data sets and how they may be overcome have rarely been addressed. Using the example of a nested metabarcoding analysis of 1920 carabid beetle regurgitates to assess plant feeding, we investigated: (i) the variation in sequencing depth of individually tagged samples and the effect of library preparation on the data output; (ii) the influence of sequencing errors within index regions and its consequences for demultiplexing; and (iii) the effect of index hopping. Our results demonstrate that despite library quantification, large variation in read counts and sequencing depth occurred among samples and that the sequencing error rate in bioinformatic software is essential for accurate adapter/primer trimming and demultiplexing. Moreover, setting an index hopping threshold to avoid incorrect assignment of samples is highly recommended.
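
One way an index hopping threshold could work is sketched below. The abstract does not state how the threshold is defined, so this example assumes a simple rule: a taxon's reads in a sample are discarded when they fall below a chosen fraction of that taxon's total reads across the run. The function name, the fraction, the sample labels, and the plant genera are all hypothetical.

```python
def filter_index_hopping(counts, min_fraction=0.001):
    """Drop per-sample taxon counts that look like index-hopping carry-over.

    counts: {sample: {taxon: reads}}
    A count is kept only if it reaches min_fraction of the taxon's
    run-wide total (an assumed rule, not the one used in the study).
    """
    totals = {}
    for per_sample in counts.values():
        for taxon, n in per_sample.items():
            totals[taxon] = totals.get(taxon, 0) + n
    return {
        sample: {taxon: n for taxon, n in per_sample.items()
                 if n >= min_fraction * totals[taxon]}
        for sample, per_sample in counts.items()
    }


# Invented read counts for two regurgitate samples.
run = {
    "beetle_01": {"Taraxacum": 9990, "Poa": 4},
    "beetle_02": {"Taraxacum": 10, "Poa": 3996},
}
cleaned = filter_index_hopping(run, min_fraction=0.005)
```

Here the handful of reads that appear in the "wrong" sample are removed, while the dominant signal in each sample is untouched. In practice the threshold would need to be calibrated, for example against negative controls.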


Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

10.7287/peerj.preprints.1695 ◽

2016 ◽

Author(s):

Tom O Delmont ◽

A. Murat Eren

Keyword(s):

High Throughput Sequencing ◽

Draft Genome ◽

Cost Effective ◽

Single Copy ◽

Eukaryotic Genome ◽

Metagenomic Data ◽

Sequencing Data ◽

Long Read ◽

Domains Of Life ◽

Genome Assemblies

High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini using approaches routinely employed by microbial ecologists who reconstruct bacterial and archaeal genomes from metagenomic data. We created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes.


Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

10.7287/peerj.preprints.1695v1 ◽

2016 ◽

Cited By ~ 1

Author(s):

Tom O Delmont ◽

A. Murat Eren

Keyword(s):

High Throughput Sequencing ◽

Draft Genome ◽

Cost Effective ◽

Single Copy ◽

Eukaryotic Genome ◽

Metagenomic Data ◽

Sequencing Data ◽

Long Read ◽

Domains Of Life ◽

Genome Assemblies


Consequences of organ choice in describing bacterial pathogen assemblages in a rodent population

Epidemiology and Infection ◽

10.1017/s0950268817001893 ◽

2017 ◽

Vol 145 (14) ◽

pp. 3070-3075 ◽

Cited By ~ 3

Author(s):

P. VILLETTE ◽

E. AFONSO ◽

G. COUVAL ◽

A. LEVRET ◽

M. GALAN ◽

...

Keyword(s):

High Throughput Sequencing ◽

Bacterial Pathogens ◽

Bacterial Pathogen ◽

Cost Effective ◽

Operational Taxonomic Unit ◽

Amplicon Sequencing ◽

Bacterial Assemblage ◽

Sequencing Technologies ◽

16S rRNA Amplicon Sequencing ◽

Bacterial Richness

Summary. High-throughput sequencing technologies now allow for rapid cost-effective surveys of multiple pathogens in many host species including rodents, but it is currently unclear if the organ chosen for screening influences the number and identity of bacteria detected. We used 16S rRNA amplicon sequencing to identify bacterial pathogens in the heart, liver, lungs, kidneys and spleen of 13 water voles (Arvicola terrestris) collected in Franche-Comté, France. We asked if bacterial pathogen assemblages within organs are similar and if all five organs are necessary to detect all of the bacteria present in an individual animal. We identified 24 bacteria representing 17 genera; average bacterial richness for each organ ranged from 1·5 ± 0·4 (mean ± standard error) to 2·5 ± 0·4 bacteria/organ and did not differ significantly between organs. The average bacterial richness when organ assemblages were pooled within animals was 4·7 ± 0·6 bacteria/animal; Operational Taxonomic Unit accumulation analysis indicates that all five organs are required to obtain this. Organ type influences bacterial assemblage composition in a systematic way (PERMANOVA, 999 permutations, pseudo-F(4,51) = 1·37, P = 0·001). Our results demonstrate that the number of organs sampled influences the ability to detect bacterial pathogens, which can inform sampling decisions in public health and wildlife ecology.


CaptureSeq: Hybridization-based enrichment of cpn60 gene fragments reveals the community structures of synthetic and natural microbial ecosystems

10.1101/492116 ◽

2018 ◽

Author(s):

Matthew G. Links ◽

Tim J. Dumonceaux ◽

Luke McCarthy ◽

Sean M. Hemmingsen ◽

Edward Topp ◽

...

Keyword(s):

Microbial Community ◽

Microbial Communities ◽

Shotgun Sequencing ◽

Natural Ecosystems ◽

Gene Markers ◽

Microbiome Composition ◽

Microbial Profile ◽

Domains Of Life ◽

Metagenome Sequencing ◽

PCR Targeting

Abstract. Background: Molecular profiling of complex microbial communities has become the basis for examining the relationship between the microbiome composition, structure and metabolic functions of those communities. Microbial community structure can be partially assessed with universal PCR targeting taxonomic or functional gene markers. Increasingly, shotgun metagenomic DNA sequencing is providing more quantitative insight into microbiomes. However, both amplicon-based and shotgun sequencing approaches have shortcomings that limit the ability to study microbiome dynamics. Methods: We present a novel, amplicon-free, hybridization-based method (CaptureSeq) for profiling complex microbial communities using probes based on the chaperonin-60 gene. Molecular profiles of a commercially available synthetic microbial community standard were compared using CaptureSeq, whole metagenome sequencing, and 16S universal target amplification. Profiles were also generated for natural ecosystems including antibiotic-amended soils, manure storage tanks, and an agricultural reservoir. Results: The CaptureSeq method generated a microbial profile that encompassed all of the bacteria and eukaryotes in the panel with greater reproducibility and more accurate representation of high G/C content microorganisms compared to 16S amplification. In the natural ecosystems, CaptureSeq provided a much greater depth of coverage and sensitivity of detection compared to shotgun sequencing without prior selection. The resulting community profiles provided quantitatively reliable information about all three Domains of life (Bacteria, Archaea, and Eukarya) in the different ecosystems. The applications of CaptureSeq will facilitate accurate studies of host-microbiome interactions for environmental, crop, animal and human health.


Rapid and cost-effective generation of single specimen multilocus barcoding data from whole arthropod communities by multiple levels of multiplexing

Scientific Reports ◽

10.1038/s41598-019-54927-z ◽

2020 ◽

Vol 10 (1) ◽

Cited By ~ 2

Author(s):

Guillemette A. de Kerdrel ◽

Jeremy C. Andersen ◽

Susan R. Kennedy ◽

Rosemary Gillespie ◽

Henrik Krehenwinkel

Keyword(s):

High Throughput Sequencing ◽

Cost Effective ◽

Amplicon Sequencing ◽

Sequencing Technology ◽

Molecular Barcoding ◽

Biodiversity Crisis ◽

Effective Generation ◽

Trapping Methods ◽

Multiple Levels ◽

Illumina Amplicon Sequencing

Abstract. In light of the current biodiversity crisis, molecular barcoding has developed into an irreplaceable tool. Barcoding has been considerably simplified by developments in high throughput sequencing technology, but still can be prohibitively expensive and laborious when community samples of thousands of specimens need to be processed. Here, we outline an Illumina amplicon sequencing approach to generate multilocus data from large collections of arthropods. We reduce cost and effort up to 50-fold by combining multiplex PCRs and DNA extractions from pools of presorted and morphotyped specimens and using two levels of sample indexing. We test our protocol by generating a comprehensive, community-wide dataset of barcode sequences for several thousand Hawaiian arthropods from 14 orders, which were collected across the archipelago using various trapping methods. We explore patterns of diversity across the archipelago and compare the utility of different arthropod trapping methods for biodiversity explorations on Hawaii, highlighting undergrowth beating as a highly efficient method. Moreover, we show the effects of barcode marker, taxonomy and relative biomass of the targeted specimens and sequencing coverage on taxon recovery. Our protocol enables rapid and inexpensive explorations of diversity patterns and the generation of multilocus barcode reference libraries across whole ecosystems.


Host-associated microbe PCR (hamPCR): accessing new biology through convenient measurement of both microbial load and community composition

10.1101/2020.05.19.103937 ◽

2020 ◽

Cited By ~ 1

Author(s):

Derek S. Lundberg ◽

Pratchaya Pramoj Na Ayutthaya ◽

Annett Strauß ◽

Gautam Shirsekar ◽

Wen-Sui Lo ◽

...

Keyword(s):

Microbial Community ◽

Community Composition ◽

Compositional Data ◽

Microbial Community Composition ◽

Cost Effective ◽

Amplicon Sequencing ◽

Microbial Load ◽

rRNA Gene ◽

Metagenome Sequencing ◽

Transformative Approach

Abstract. The ratio of microbial population size relative to the amount of host tissue, or “microbial load”, is a fundamental metric of colonization and infection, but it cannot be directly deduced from microbial amplicon data such as 16S rRNA gene counts. Because conventional methods to determine load, such as serial dilution plating or quantitative PCR, add substantial experimental burden, they are only rarely paired with amplicon sequencing. Alternatively, whole metagenome sequencing of DNA contributed by host and microbes both reveals microbial community composition and enables determination of microbial load, but host DNA typically greatly outweighs microbial DNA, severely limiting the cost-effectiveness and scalability of this approach. We introduce host-associated microbe PCR (hamPCR), a robust amplicon sequencing strategy to quantify microbial load and describe interkingdom microbial community composition in a single, cost-effective library. We demonstrate its accuracy and flexibility across multiple host and microbe systems, including nematodes and major crops. We further present a technique that can be used, prior to sequencing, to optimize the host representation in a batch of libraries without loss of information. Because of its simplicity, and the fact that it provides an experimental solution to the well-known statistical challenges posed by compositional data, hamPCR will become a transformative approach throughout culture-independent microbiology.
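
The load calculation implied by this abstract can be sketched as follows, assuming that a single library yields read counts for one host amplicon alongside the microbial amplicons and that load is expressed as microbial reads per host read. The function name, the "host" key, and the counts are hypothetical; any corrections the actual method applies are not modelled.

```python
def microbial_load(reads, host_key="host"):
    """Express each microbial taxon as reads per host amplicon read."""
    host = reads[host_key]
    if host == 0:
        raise ValueError("host amplicon count must be non-zero")
    return {taxon: n / host for taxon, n in reads.items() if taxon != host_key}


# Invented counts from two libraries sequenced to the same depth.
healthy = microbial_load({"host": 8000, "Pseudomonas": 1000, "Sphingomonas": 1000})
infected = microbial_load({"host": 2000, "Pseudomonas": 7000, "Sphingomonas": 1000})
```

Both libraries contain the same number of Sphingomonas reads, but normalizing to the host amplicon shows a fourfold higher load in the second sample, which read counts or relative abundances alone would not reveal.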
