Studying ecosystems with DNA metabarcoding: lessons from aquatic biomonitoring

An ongoing challenge for ecological studies has been the collection of data with high precision and accuracy at a sufficient scale to detect effects relevant to management of critical global change processes. A major hurdle for many workflows has been the time-consuming and challenging process of sorting and identification of organisms, but the rapid development of DNA metabarcoding as a biodiversity observation tool provides a potential solution. As high-throughput sequencing becomes more rapid and cost-effective, a ‘big data’ revolution is anticipated, based on higher and more accurate taxonomic resolution, more efficient detection, and greater sample processing capacity. These advances have the potential to amplify the power of ecological studies to detect change and diagnose its cause, through a methodology termed ‘Biomonitoring 2.0’.

Despite its promise, the unfamiliar terminology and pace of development in high-throughput sequencing technologies has contributed to a growing concern that an unproven technology is supplanting tried and tested approaches, lowering trust among potential users, and reducing uptake by ecologists and environmental management practitioners. While it is reasonable to exercise caution, we argue that any criticism of new methods must also acknowledge the shortcomings and lower capacity of current observation methods. Broader understanding of the statistical properties of metabarcoding data will help ecologists to design, test and review evidence for new hypotheses.

We highlight the uncertainties and challenges underlying DNA metabarcoding and traditional methods for compositional analysis, focusing on issues of taxonomic resolution, sample similarity, taxon misidentification, sample contamination, and taxon abundance. Using the example of freshwater benthic ecosystems, one of the most widely-applied non-microbial applications of DNA metabarcoding to date, we explore the ability of this new technology to improve the quality and utility of ecological data, recognising that the issues raised have widespread applicability across all ecosystem types.


The use of eDNA and DNA metabarcoding in monitoring the ecological condition of Norwegian lakes

ARPHA Conference Abstracts ◽

10.3897/aca.4.e65309 ◽

2021 ◽

Vol 4 ◽

Author(s):

Sara Atienza Casas ◽

Markus Majaneva ◽

Thomas Jensen ◽

Marie Davey ◽

Frode Fossøy ◽

...

Keyword(s):

High Throughput Sequencing ◽

Ecological Status ◽

Ecological Condition ◽

Molecular Techniques ◽

Taxonomic Resolution ◽

Potential Cost ◽

Running Water ◽

National Monitoring ◽

Monitoring Programs ◽

Dna Metabarcoding

Biodiversity assessments using molecular identification of organisms through high-throughput sequencing techniques have been a game changer in ecosystem monitoring, providing increased taxonomic resolution, more objective identifications, potential cost reductions, and reduced processing times. The use of DNA metabarcoding of bulk samples and environmental DNA (eDNA) is now widespread but is not yet universally implemented in national monitoring programs. While bulk sample metabarcoding involves extraction of DNA from organisms in a sample, eDNA analysis involves obtaining DNA directly from environmental samples, which can include microorganisms, meiofauna-size taxa and macrofauna traces such as larval stages, skin and hair cells, gametes, faeces and free DNA bound to particles. In Norway, freshwater biomonitoring in compliance with the EU Water Framework Directive (WFD) is conducted on several administrative levels, including national monitoring programs for running water, small and large lakes. These programs typically focus on a fraction of the actual biodiversity present in the monitored habitats (Weigand 2019). DNA metabarcoding of both bulk samples and eDNA samples are relevant tools for future freshwater biomonitoring in Norway. The aim of this PhD project is to develop assessment protocols based on DNA-metabarcoding and eDNA of benthic invertebrates, microcrustaceans and fish that can be used as standard biomonitoring tools to assess the ecological condition of lakes. The main topics addressed will be:
- Development of protocols throughout the eDNA-metabarcoding workflow (i.e. sampling, filtration, preservation, extraction, amplification and sequencing) suitable to execute biodiversity assessments and determine the ecological status of lakes.
- Comparison of the results obtained using molecular tools and traditional morphology-based approaches in order to assess the feasibility of such techniques to be incorporated as standard biomonitoring tools, such as the ones implemented under the provisions of the WFD.
- Evaluate the effect of improved taxonomic resolution from molecular techniques on determining the ecological status of lakes, both by broadening the number of taxa analyzed and by identifying more taxa to species level.
- Assess the feasibility of using eDNA extracted from water samples, taken at different depths and fish densities, to measure fish abundance/biomass as a proxy to calculate the ecological quality indices regulated in the WFD.
- Analyze the coverage and resolution provided by reference libraries for certain taxa, such as crustacea, in order to assess the reliability and precision of taxonomic assignments.


DNA-Based Herbal Teas’ Authentication: An ITS2 and psbA-trnH Multi-Marker DNA Metabarcoding Approach

Plants ◽

10.3390/plants10102120 ◽

2021 ◽

Vol 10 (10) ◽

pp. 2120

Author(s):

Jessica Frigerio ◽

Giulia Agostinetto ◽

Valerio Mezzasalma ◽

Fabrizio De Mattia ◽

Massimo Labra ◽

...

Keyword(s):

Quality Control ◽

Quantitative Analysis ◽

Medicinal Plants ◽

High Throughput ◽

High Throughput Sequencing ◽

The Other ◽

Plant Component ◽

Identification Rate ◽

Dna Metabarcoding ◽

Therapeutic Properties

Medicinal plants have been widely used in traditional medicine due to their therapeutic properties. Although they are mostly used as herbal infusions and tinctures, their use as ingredients of food supplements is increasing. However, fraud and adulteration are widespread issues. In our study, we aimed at evaluating DNA metabarcoding as a tool to identify product composition. To accomplish this, we analyzed fifteen commercial products with DNA metabarcoding, using two barcode regions: psbA-trnH and ITS2. Results showed that on average, 70% (44–100) of the declared ingredients were identified. The ITS2 marker appears to identify more species (n = 60) than psbA-trnH (n = 35), with an ingredients’ identification rate of 52% versus 45%, respectively. Some species are identified by only one of the two markers. Additionally, to evaluate the quantitative ability of high-throughput sequencing (HTS) to compare the plant component to the corresponding assigned sequences, we created six mock mixtures of plants in the laboratory, starting from both biomass and gDNA. Our analysis also supports the application of DNA metabarcoding for a relative quantitative analysis. These results move towards the application of HTS analysis for studying the composition of herbal teas for medicinal plants’ traceability and quality control.
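The per-marker and combined identification rates described in this abstract can be illustrated with a minimal Python sketch. It assumes the rate is simply the share of declared ingredients found among the taxa a marker detects; the function name `identification_rate` and the example product are hypothetical and are not taken from the study.

```python
def identification_rate(declared, detected):
    """Fraction of declared ingredients that appear among the detected taxa."""
    declared = set(declared)
    if not declared:
        raise ValueError("declared ingredient list is empty")
    return len(declared & set(detected)) / len(declared)


# Hypothetical product label and marker results (illustrative names only).
declared = {
    "Matricaria chamomilla",
    "Mentha piperita",
    "Melissa officinalis",
    "Foeniculum vulgare",
}
its2_hits = {"Matricaria chamomilla", "Mentha piperita", "Taraxacum officinale"}
psba_hits = {"Melissa officinalis", "Mentha piperita"}

rate_its2 = identification_rate(declared, its2_hits)
rate_psba = identification_rate(declared, psba_hits)

# Combining markers takes the union of detections, so a species seen by
# only one marker still counts as identified.
rate_combined = identification_rate(declared, its2_hits | psba_hits)

# Taxa detected but not declared on the label are candidates for follow-up.
undeclared = (its2_hits | psba_hits) - declared
```

In this toy case each marker alone recovers half of the label, while the union recovers three of four ingredients, which is the kind of complementarity a multi-marker design relies on.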


HTSeq - A Python framework to work with high-throughput sequencing data

10.1101/002824 ◽

2014 ◽

Cited By ~ 242

Author(s):

Simon Anders ◽

Paul Theodor Pyl ◽

Wolfgang Huber

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Rapid Development ◽

Differential Expression Analysis ◽

Rna Seq ◽

Sequencing Data ◽

Standard Work ◽

Data Formats ◽

High Throughput Sequencing Data ◽

Python Package

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index, https://pypi.python.org/pypi/HTSeq
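The idea of counting the overlap of reads with genes can be sketched in plain Python. The example below deliberately does not use the HTSeq API; the function names, the half-open coordinate convention and the toy gene models are assumptions made for illustration. A read that overlaps exactly one gene is counted for it, while reads overlapping no gene or several genes are tallied separately.

```python
from collections import Counter


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def count_reads(genes, reads):
    """Count reads per gene; reads hitting zero or several genes get their own bins."""
    counts = Counter()
    for chrom, start, end in reads:
        hits = {
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and overlaps(start, end, g_start, g_end)
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
        elif not hits:
            counts["no_feature"] += 1
        else:
            counts["ambiguous"] += 1
    return counts


# Hypothetical gene models and aligned reads: (chromosome, start, end).
genes = {"geneA": ("chr1", 100, 500), "geneB": ("chr1", 450, 900)}
reads = [
    ("chr1", 120, 170),   # inside geneA only
    ("chr1", 460, 510),   # in the region shared by geneA and geneB
    ("chr1", 950, 1000),  # outside both genes
    ("chr1", 600, 650),   # inside geneB only
]
counts = count_reads(genes, reads)
```

The linear scan over all genes is fine for a toy example; a coordinate-indexed data structure of the kind the abstract describes would replace it for real data.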


PLANiTS: a curated sequence reference dataset for plant ITS DNA metabarcoding

Database ◽

10.1093/database/baz155 ◽

2020 ◽

Vol 2020 ◽

Cited By ~ 6

Author(s):

Elisa Banchi ◽

Claudio G Ametrano ◽

Samuele Greco ◽

David Stanković ◽

Lucia Muggia ◽

...

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Its Region ◽

Computational Effort ◽

Its Sequences ◽

Reference Dataset ◽

Bioinformatic Pipeline ◽

Taxonomic Level ◽

Dna Metabarcoding ◽

Reference Databases

DNA metabarcoding combines DNA barcoding with high-throughput sequencing to identify different taxa within environmental communities. The ITS has already been proposed and widely used as a universal barcode marker for plants, but a comprehensive, updated and accurate reference dataset of plant ITS sequences has not been available so far. Here, we constructed reference datasets of Viridiplantae ITS1, ITS2 and entire ITS sequences including both Chlorophyta and Streptophyta. The sequences were retrieved from NCBI, and the ITS region was extracted. The sequences underwent an identity check to remove misidentified records and were clustered at 99% identity to reduce redundancy and computational effort. For this step, we developed a script called ‘better clustering for QIIME’ (bc4q) to ensure that the representative sequences are chosen according to the composition of the cluster at a different taxonomic level. The three datasets obtained with the bc4q script are PLANiTS1 (100 224 sequences), PLANiTS2 (96 771 sequences) and PLANiTS (97 550 sequences), and all are pre-formatted for QIIME, as this is the most widely used bioinformatic pipeline for metabarcoding analysis. As curated and updated reference databases, PLANiTS1, PLANiTS2 and PLANiTS are proposed as a reliable, pivotal first step for a general standardization of plant DNA metabarcoding studies. The bc4q script is presented as a new tool useful in any research dealing with sequence clustering. Database URL: https://github.com/apallavicini/bc4q; https://github.com/apallavicini/PLANiTS.
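One possible reading of choosing a representative "according to the composition of the cluster" is to pick a sequence whose taxon matches the majority taxon in that cluster. The Python sketch below illustrates that reading only; it is not the bc4q implementation, and the function name, the use of genus as the taxonomic level and the example cluster are assumptions.

```python
from collections import Counter


def pick_representative(cluster):
    """Return (seq_id, genus) for the first member carrying the majority genus.

    cluster: list of (seq_id, genus) tuples in input order.
    Ties between genera resolve to the genus encountered first.
    """
    if not cluster:
        raise ValueError("cluster is empty")
    genus_counts = Counter(genus for _, genus in cluster)
    majority_genus, _ = genus_counts.most_common(1)[0]
    for seq_id, genus in cluster:
        if genus == majority_genus:
            return seq_id, majority_genus


# Hypothetical cluster: the first sequence is a taxonomic outlier.
cluster = [("seq1", "Quercus"), ("seq2", "Fagus"), ("seq3", "Fagus")]
representative = pick_representative(cluster)
```

Here a naive "first sequence wins" rule would label the whole cluster with the outlier genus, whereas the majority rule keeps the label that most members share.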


Sorting things out - assessing effects of unequal specimen biomass on DNA metabarcoding

10.7287/peerj.preprints.2561v1 ◽

2016 ◽

Cited By ~ 1

Author(s):

Vasco Elbrecht ◽

Bianca Peinert ◽

Florian Leese

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Depth ◽

West Germany ◽

Coi Gene ◽

Mountain Stream ◽

Detection Rates ◽

Size Classes ◽

Dna Metabarcoding ◽

Size Sorting

1) Environmental bulk samples often contain many taxa with biomass differences of several orders of magnitude. This can be problematic in DNA metabarcoding and metagenomic high throughput sequencing approaches, as large specimens contribute a disproportionately large share of the DNA template. Thus a few specimens of high biomass will dominate the dataset, potentially leading to smaller specimens remaining undetected. Sorting of samples and balancing the amounts of tissue used per size fraction should improve detection rates, but has not been systematically tested.
2) Here we tested the effects of size sorting on taxa detection using freshwater macroinvertebrates. Kick sampling was performed at two locations of a low-mountain stream in West Germany, and specimens were morphologically identified and sorted into small, medium and large size classes (< 2.5x5, 5x10 and up to 10x20 mm). Tissue from the 3 size categories was extracted individually, and pooled to simulate bulk samples that were not sorted and samples which were sorted and then pooled proportionately by specimen size. DNA from all 5 extractions of both samples was amplified using 4 different freshwater primer sets for the COI gene and sequenced on a HiSeq Illumina sequencer.
3) Sorting taxa by size and pooling them proportionately according to their abundance led to a more equal amplification compared to the processing of complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples, at the same sequencing depth. Our results imply that sequencing depth can be decreased ~ 5 fold when sorting the samples into three size classes.
4) Our results demonstrate that even a coarse size sorting can substantially improve detection rates. While high throughput sequencing will become more accessible and cheaper in the coming years, sorting bulk samples by specimen biomass is a simple yet efficient method to reduce current sequencing costs.


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456v3 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.
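Uniquely tagging 288 wells amounts to giving every well its own combination of forward and reverse tags. The Python sketch below shows one way to lay that out; the abstract does not state how many tagged primers exist in each direction, so the 24 forward and 12 reverse tags and their names are purely hypothetical.

```python
from itertools import product
from string import ascii_uppercase

N_PLATES, ROWS, COLS = 3, 8, 12  # three 96-well plates, as in the abstract

# Hypothetical tag sets: 24 x 12 = 288 combinations (counts are an assumption).
FWD_TAGS = [f"F{i:02d}" for i in range(1, 25)]
REV_TAGS = [f"R{i:02d}" for i in range(1, 13)]

# Wells in plate order, e.g. (1, "A1"), (1, "A2"), ..., (3, "H12").
wells = [
    (plate, f"{ascii_uppercase[row]}{col}")
    for plate in range(1, N_PLATES + 1)
    for row in range(ROWS)
    for col in range(1, COLS + 1)
]

tag_pairs = list(product(FWD_TAGS, REV_TAGS))
if len(tag_pairs) < len(wells):
    raise ValueError("not enough tag combinations to label every well uniquely")

# Each well receives a distinct (forward tag, reverse tag) pair.
layout = dict(zip(wells, tag_pairs))
```

Under the same logic, adding a further indexing layer multiplies the capacity: with k distinct library indices, 288 × k wells could be distinguished, which is how tagging could reach thousands of samples.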


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456v2 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls


GenBank is a reliable resource for 21st century biodiversity research

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1911714116 ◽

2019 ◽

Vol 116 (45) ◽

pp. 22651-22656 ◽

Cited By ~ 34

Author(s):

Matthieu Leray ◽

Nancy Knowlton ◽

Shian-Lei Ho ◽

Bryan N. Nguyen ◽

Ryuji J. Machida

Keyword(s):

Phylogenetic Analysis ◽

Global Change ◽

High Throughput ◽

High Throughput Sequencing ◽

Environmental Dna ◽

Taxonomic Resolution ◽

Biodiversity Research ◽

Animal Diversity ◽

Mitochondrial Sequences ◽

The Many

Traditional methods of characterizing biodiversity are increasingly being supplemented and replaced by approaches based on DNA sequencing alone. These approaches commonly involve extraction and high-throughput sequencing of bulk samples from biologically complex communities or samples of environmental DNA (eDNA). In such cases, vouchers for individual organisms are rarely obtained, often unidentifiable, or unavailable. Thus, identifying these sequences typically relies on comparisons with sequences from genetic databases, particularly GenBank. While concerns have been raised about biases and inaccuracies in laboratory and analytical methods, comparatively little attention has been paid to the taxonomic reliability of GenBank itself. Here we analyze the metazoan mitochondrial sequences of GenBank using a combination of distance-based clustering and phylogenetic analysis. Because of their comparatively rapid evolutionary rates and consequent high taxonomic resolution, mitochondrial sequences represent an invaluable resource for the detection of the many small and often undescribed organisms that represent the bulk of animal diversity. We show that metazoan identifications in GenBank are surprisingly accurate, even at low taxonomic levels (likely <1% error rate at the genus level). This stands in contrast to previously voiced concerns based on limited analyses of particular groups and the fact that individual researchers currently submit annotated sequences to GenBank without significant external taxonomic validation. Our encouraging results suggest that the rapid uptake of DNA-based approaches is supported by a bioinformatic infrastructure capable of assessing both the losses to biodiversity caused by global change and the effectiveness of conservation efforts aimed at slowing or reversing these losses.


Integrative analyses of transcriptome data reveal the mechanisms of post-transcriptional regulation

Briefings in Functional Genomics ◽

10.1093/bfgp/elab004 ◽

2021 ◽

Author(s):

Jinkai Wang

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Rna Binding ◽

Rna Binding Proteins ◽

Rapid Development ◽

Sequencing Data ◽

High Throughput Sequencing Data ◽

Public Resources ◽

Integrative Analyses ◽

Post Transcriptional Regulation

Post-transcriptional processing of RNAs plays important roles in a variety of physiological and pathological processes. These processes can be precisely controlled by a series of RNA binding proteins and cotranscriptionally regulated by transcription factors as well as histone modifications. With the rapid development of high-throughput sequencing techniques, multiomics data have been broadly used to study the mechanisms underlying important biological processes. However, how to use these high-throughput sequencing data to elucidate the fundamental regulatory roles of post-transcriptional processes remains a great challenge. This review summarizes the regulatory mechanisms of post-transcriptional processes and the general principles and approaches to dissect these mechanisms by integrating multiomics data as well as public resources.


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one possible metabarcoding approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.
