Estimating intraspecific genetic diversity from community DNA metabarcoding data

10.7287/peerj.preprints.3269v4 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Ecaterina Edith Vamos ◽

Dirk Steinke ◽

Florian Leese

Keyword(s):

Genetic Diversity ◽

Biological Diversity ◽

Intraspecific Diversity ◽

Great Promise ◽

Data Sets ◽

Mock Community ◽

Data Set ◽

Dna Metabarcoding ◽

Intraspecific Genetic Diversity ◽

Primer Sets

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While COI haplotype information is limited in resolution, it is nevertheless useful in a phylogeographic context, helping to formulate hypotheses on taxon dispersal. Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotypes from freshwater macroinvertebrate metabarcoding data sets. This novel approach was added to the R package "JAMP" and can be applied to Cytochrome c oxidase subunit I (COI) amplicon datasets. We tested our haplotyping method by sequencing i) a single-species mock community composed of 31 individuals with different haplotypes spanning three orders of magnitude in biomass and ii) 18 monitoring samples, each amplified with four different primer sets and two PCR replicates. Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177-200 OTUs, with an average of 2.40 to 3.30 haplotypes per OTU. Population structures were consistent between replicates, and similar between primer pairs, depending on the primer length. A closer look at abundant taxa in the data set revealed various population genetic patterns, e.g. Taeniopteryx nebulosa and Hydropsyche pellucidula with a difference in north-south haplotype distribution, while Oulimnius tuberculatus and Asellus aquaticus display no clear population pattern but differ in genetic diversity. Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate monitoring samples using metabarcoding data. It needs to be stressed that at this point metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and potential taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding data sets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about biological diversity but also underlying genetic diversity.


Estimating intraspecific genetic diversity from community DNA metabarcoding data

10.7287/peerj.preprints.3269v3 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Ecaterina Edith Vamos ◽

Dirk Steinke ◽

Florian Leese

Keyword(s):

Genetic Diversity ◽

Biological Diversity ◽

Intraspecific Diversity ◽

Great Promise ◽

Data Sets ◽

Mock Community ◽

Data Set ◽

Dna Metabarcoding ◽

Intraspecific Genetic Diversity ◽

Primer Sets

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While COI haplotype information is limited in resolution, it is nevertheless useful in a phylogeographic context, helping to formulate hypotheses on taxon dispersal. Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotypes from freshwater macroinvertebrate metabarcoding data sets. This novel approach was added to the R package "JAMP" and can be applied to Cytochrome c oxidase subunit I (COI) amplicon datasets. We tested our haplotyping method by sequencing i) a single-species mock community composed of 31 individuals with different haplotypes spanning three orders of magnitude in biomass and ii) 18 monitoring samples, each amplified with four different primer sets and two PCR replicates. Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177-200 OTUs, with an average of 2.40 to 3.30 haplotypes per OTU. Population structures were consistent between replicates, and similar between primer pairs, depending on the primer length. A closer look at abundant taxa in the data set revealed various population genetic patterns, e.g. Taeniopteryx nebulosa and Hydropsyche pellucidula with a difference in north-south haplotype distribution, while Oulimnius tuberculatus and Asellus aquaticus display no clear population pattern but differ in genetic diversity. Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate monitoring samples using metabarcoding data. It needs to be stressed that at this point metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and potential taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding data sets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about biological diversity but also underlying genetic diversity.


Assessing intraspecific genetic diversity from community DNA metabarcoding data

10.7287/peerj.preprints.3269v2 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Ecaterina Edith Vamos ◽

Dirk Steinke ◽

Florian Leese

Keyword(s):

Genetic Diversity ◽

Biological Diversity ◽

Intraspecific Diversity ◽

Great Promise ◽

Data Sets ◽

Mock Community ◽

Data Set ◽

Dna Metabarcoding ◽

Intraspecific Genetic Diversity ◽

Primer Sets

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotypes from freshwater macroinvertebrate metabarcoding data sets. This novel approach is implemented in the R package "JAMP" and can be applied to Cytochrome c oxidase subunit I (COI) amplicon datasets. We tested our haplotyping method by sequencing i) a single-species mock community composed of 31 individuals with different haplotypes spanning three orders of magnitude in biomass and ii) 18 monitoring samples, each amplified with four different primer sets and two PCR replicates. Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177-200 OTUs, with an average of 2.40 to 3.30 haplotypes per OTU. Population structures were consistent between replicates, and similar between primer pairs, depending on the primer length. A closer look at abundant taxa in the data set revealed various population genetic patterns, e.g. Taeniopteryx nebulosa and Hydropsyche pellucidula with a difference in north-south haplotype distribution, while Oulimnius tuberculatus and Asellus aquaticus display no clear population pattern but differ in genetic diversity. Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate samples using metabarcoding data. It needs to be stressed that at this point metabarcoding-informed haplotyping is not capable of capturing the full diversity present in bulk samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity is recovered, identifying potentially isolated populations and potential taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding data sets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about biological diversity but also underlying genetic diversity.


Estimating intraspecific genetic diversity from community DNA metabarcoding data

10.7287/peerj.preprints.3269 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Ecaterina Edith Vamos ◽

Dirk Steinke ◽

Florian Leese

Keyword(s):

Genetic Diversity ◽

Biological Diversity ◽

Intraspecific Diversity ◽

Great Promise ◽

Data Sets ◽

Mock Community ◽

Data Set ◽

Dna Metabarcoding ◽

Intraspecific Genetic Diversity ◽

Primer Sets

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While COI haplotype information is limited in resolution, it is nevertheless useful in a phylogeographic context, helping to formulate hypotheses on taxon dispersal. Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotypes from freshwater macroinvertebrate metabarcoding data sets. This novel approach was added to the R package "JAMP" and can be applied to Cytochrome c oxidase subunit I (COI) amplicon datasets. We tested our haplotyping method by sequencing i) a single-species mock community composed of 31 individuals with different haplotypes spanning three orders of magnitude in biomass and ii) 18 monitoring samples, each amplified with four different primer sets and two PCR replicates. Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177-200 OTUs, with an average of 2.40 to 3.30 haplotypes per OTU. Population structures were consistent between replicates, and similar between primer pairs, depending on the primer length. A closer look at abundant taxa in the data set revealed various population genetic patterns, e.g. Taeniopteryx nebulosa and Hydropsyche pellucidula with a difference in north-south haplotype distribution, while Oulimnius tuberculatus and Asellus aquaticus display no clear population pattern but differ in genetic diversity. Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate monitoring samples using metabarcoding data. It needs to be stressed that at this point metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and potential taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding data sets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about biological diversity but also underlying genetic diversity.


Environmental DNA analysis shows high potential as a tool for estimating intraspecific genetic diversity in a wild fish population

10.1101/829770 ◽

2019 ◽

Cited By ~ 1

Author(s):

Satsuki Tsuji ◽

Atsushi Maruyama ◽

Masaki Miya ◽

Masayuki Ushio ◽

Hirotoshi Sato ◽

...

Keyword(s):

Genetic Diversity ◽

Water Sample ◽

Sanger Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Dna Analysis ◽

Environmental Dna ◽

Intraspecific Diversity ◽

Survey Method ◽

Intraspecific Genetic Diversity

Environmental DNA (eDNA) analysis has recently been used as a new tool for estimating intraspecific diversity. However, whether known haplotypes contained in a sample can be detected correctly using eDNA-based methods has been examined only in an aquarium experiment. Here, we tested whether the haplotypes of Ayu fish (Plecoglossus altivelis altivelis) detected in a capture survey could also be detected from a field-derived eDNA sample that contained various haplotypes at low concentrations as well as foreign substances. A water sample and Ayu specimens collected from a river on the same day were analysed by eDNA analysis and Sanger sequencing, respectively. The 10 L water sample was divided among 20 filters, for each of which 15 PCR replicates were performed. After high-throughput sequencing, denoising was performed using two of the most widely used denoising packages, UNOISE3 and DADA2. Of the 42 haplotypes obtained from the Sanger sequencing of 96 specimens, 38 (UNOISE3) and 41 (DADA2) haplotypes were detected by eDNA analysis. When DADA2 was used, all haplotypes shared by at least two specimens, except one, were detected in every filter replicate. This study showed that eDNA analysis for evaluating intraspecific genetic diversity provides results comparable to those of large-scale, capture-based conventional methods, suggesting that it could become a more efficient survey method for investigating intraspecific genetic diversity in the field.
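
The detection figures quoted in the abstract can be turned into recovery rates with a short calculation. The Python sketch below uses only numbers stated in the abstract. The variable names are illustrative.

```python
# Numbers taken from the abstract above.
sanger_haplotypes = 42                      # haplotypes found by Sanger sequencing
detected = {"UNOISE3": 38, "DADA2": 41}     # haplotypes also found by eDNA analysis
filters, pcr_replicates_per_filter = 20, 15

# Share of the Sanger-confirmed haplotypes recovered by each denoiser.
rates = {tool: n / sanger_haplotypes for tool, n in detected.items()}
total_pcr_reactions = filters * pcr_replicates_per_filter

for tool, rate in rates.items():
    print(f"{tool}: {detected[tool]}/{sanger_haplotypes} haplotypes = {rate:.1%}")
print(f"PCR reactions in total: {total_pcr_reactions}")
```

On these figures, UNOISE3 recovers roughly 90% and DADA2 roughly 98% of the Sanger-confirmed haplotypes, from 300 PCR reactions in total.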


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456v3 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.
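
The tagging capacity mentioned in the abstract (three 96-well plates, 288 wells) can be checked with a small Python sketch. The well-naming scheme and the number of Illumina index sets (`n_index_sets`) are assumptions made for illustration. The abstract states only that indices extend tagging to thousands of samples, not how many index sets are used.

```python
from itertools import product

ROWS = "ABCDEFGH"        # 8 rows of a standard 96-well plate
COLUMNS = range(1, 13)   # 12 columns


def plate_wells(n_plates: int = 3) -> list:
    """Enumerate well identifiers such as 'P1-A01' (hypothetical naming scheme)."""
    return [f"P{p}-{row}{col:02d}"
            for p, row, col in product(range(1, n_plates + 1), ROWS, COLUMNS)]


wells = plate_wells(3)
n_index_sets = 8  # hypothetical number of Illumina index sets, not from the source
capacity_with_indices = len(wells) * n_index_sets

print(len(wells), wells[0], wells[-1])
print(capacity_with_indices)
```

Each well corresponds to one uniquely tagged reaction within a run. Multiplying by the number of index sets shows how the same fusion primer tags could be reused across indexed libraries, under the assumption that every tag is combined with every index set.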


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456v2 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one possible metabarcoding approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456v4 ◽

2018 ◽

Cited By ~ 1

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one possible metabarcoding approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.


Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring

10.7287/peerj.preprints.3456v5 ◽

2018 ◽

Author(s):

Vasco Elbrecht ◽

Dirk Steinke

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Large Scale ◽

High Throughput Sequencing ◽

Scaling Up ◽

The Past ◽

Dna Metabarcoding ◽

Primer Sets ◽

Fusion Primer ◽

Positive Controls

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over the past years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. We here conceptualize a streamlined metabarcoding workflow, in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one possible metabarcoding approach, and that we hope this article will stimulate discussion and publication of alternatives and extensions.
