Estimating intraspecific genetic diversity from community DNA metabarcoding data

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4644 ◽  
Author(s):  
Vasco Elbrecht ◽  
Ecaterina Edith Vamos ◽  
Dirk Steinke ◽  
Florian Leese

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While Cytochrome c oxidase subunit I (COI) haplotype information is limited in resolving intraspecific diversity, it is nevertheless often useful, e.g. in a phylogeographic context, helping to formulate hypotheses on taxon distribution and dispersal.

Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotype information from freshwater macroinvertebrate metabarcoding datasets. This novel approach was added to the R package “JAMP” and can be applied to COI amplicon datasets. We tested our haplotyping method by sequencing (i) a single-species mock community composed of 31 individuals with 15 different haplotypes spanning three orders of magnitude in biomass and (ii) 18 monitoring samples each amplified with four different primer sets and two PCR replicates.

Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but also can discard expected haplotypes mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40–3.30 haplotypes per OTU. The derived intraspecific diversity data showed population structures that were consistent between replicates and similar between primer pairs but resolution depended on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. the stonefly Taeniopteryx nebulosa and the caddisfly Hydropsyche pellucidula showed a distinct north–south cline with respect to haplotype distribution, while the beetle Oulimnius tuberculatus and the isopod Asellus aquaticus displayed no clear population pattern but differed in genetic diversity.

Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate metabarcoding data. It needs to be stressed that at this point this metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and taxa for further more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding datasets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about species diversity but also underlying genetic diversity.
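The trade-off reported in the Results, where relaxed settings retain unexpected haplotypes and rigorous settings discard expected ones from small specimens, can be illustrated with a minimal Python sketch of abundance-based filtering. This is not the JAMP implementation: the function names, the toy read counts, the OTU assignments and both thresholds are assumptions chosen only for illustration.

```python
from collections import defaultdict


def filter_haplotypes(sample_counts, min_rel_abundance):
    """Keep denoised variants whose share of the sample's reads meets the cutoff."""
    total = sum(sample_counts.values())
    if total == 0:
        return {}
    return {
        hap: n
        for hap, n in sample_counts.items()
        if n / total >= min_rel_abundance
    }


def haplotypes_per_otu(haplotypes, otu_of):
    """Group the surviving haplotypes by the OTU they were assigned to."""
    grouped = defaultdict(set)
    for hap in haplotypes:
        grouped[otu_of[hap]].add(hap)
    return dict(grouped)


# Toy read counts for one sample after denoising (all values invented).
counts = {"hapA": 9000, "hapB": 900, "hapC": 95, "hapD": 5}
# Hypothetical OTU assignment of each haplotype.
otu_of = {"hapA": "OTU_1", "hapB": "OTU_1", "hapC": "OTU_2", "hapD": "OTU_2"}

# A relaxed cutoff keeps every variant, including rare ones that may be noise.
relaxed = filter_haplotypes(counts, min_rel_abundance=0.0001)
# A strict cutoff removes rare variants, which may include real haplotypes
# contributed by low-biomass specimens.
strict = filter_haplotypes(counts, min_rel_abundance=0.01)

print(sorted(relaxed))
print(haplotypes_per_otu(strict, otu_of))
```

With these invented counts the strict cutoff removes OTU_2 entirely, which mirrors the loss of low-abundance sequence variants noted in the Discussion.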

2018 ◽  
Author(s):  
Vasco Elbrecht ◽  
Ecaterina Edith Vamos ◽  
Dirk Steinke ◽  
Florian Leese

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While COI haplotype information is limited in resolution, it is nevertheless useful in a phylogeographic context, helping to formulate hypotheses on taxon dispersal. Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotypes from freshwater macroinvertebrate metabarcoding datasets. This novel approach was added to the R package "JAMP" and can be applied to Cytochrome c oxidase subunit I (COI) amplicon datasets. We tested our haplotyping method by sequencing i) a single-species mock community composed of 31 individuals with different haplotypes spanning three orders of magnitude in biomass and ii) 18 monitoring samples each amplified with four different primer sets and two PCR replicates. Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40 to 3.30 haplotypes per OTU. Population structures were consistent between replicates and similar between primer pairs, with resolution depending on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. Taeniopteryx nebulosa and Hydropsyche pellucidula showed a north-south difference in haplotype distribution, while Oulimnius tuberculatus and Asellus aquaticus displayed no clear population pattern but differed in genetic diversity. Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate monitoring samples using metabarcoding data. It needs to be stressed that at this point metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding datasets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that seek information not only about biological diversity but also about the underlying genetic diversity.


2018 ◽  
Author(s):  
Vasco Elbrecht ◽  
Ecaterina Edith Vamos ◽  
Dirk Steinke ◽  
Florian Leese

Background. DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. Methods. This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotypes from freshwater macroinvertebrate metabarcoding datasets. This novel approach is implemented in the R package "JAMP" and can be applied to Cytochrome c oxidase subunit I (COI) amplicon datasets. We tested our haplotyping method by sequencing i) a single-species mock community composed of 31 individuals with different haplotypes spanning three orders of magnitude in biomass and ii) 18 monitoring samples each amplified with four different primer sets and two PCR replicates. Results. We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40 to 3.30 haplotypes per OTU. Population structures were consistent between replicates and similar between primer pairs, with resolution depending on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. Taeniopteryx nebulosa and Hydropsyche pellucidula showed a north-south difference in haplotype distribution, while Oulimnius tuberculatus and Asellus aquaticus displayed no clear population pattern but differed in genetic diversity. Discussion. We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate samples using metabarcoding data. It needs to be stressed that at this point metabarcoding-informed haplotyping is not capable of capturing the full diversity present in bulk samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity is recovered, identifying potentially isolated populations and taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding datasets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that seek information not only about biological diversity but also about the underlying genetic diversity.


2019 ◽  
Author(s):  
Satsuki Tsuji ◽  
Atsushi Maruyama ◽  
Masaki Miya ◽  
Masayuki Ushio ◽  
Hirotoshi Sato ◽  
...  

Abstract. Environmental DNA (eDNA) analysis has recently been used as a new tool for estimating intraspecific diversity. However, whether known haplotypes contained in a sample can be detected correctly using eDNA-based methods has been examined only in an aquarium experiment. Here, we tested whether the haplotypes of Ayu fish (Plecoglossus altivelis altivelis) detected in a capture survey could also be detected in a field-derived eDNA sample that contained various haplotypes at low concentrations together with foreign substances. A water sample and Ayu specimens collected from a river on the same day were analysed by eDNA analysis and Sanger sequencing, respectively. The 10 L water sample was split across 20 filters, and 15 PCR replicates were performed for each filter. After high-throughput sequencing, denoising was performed using two of the most widely used denoising packages, UNOISE3 and DADA2. Of the 42 haplotypes obtained from the Sanger sequencing of 96 specimens, 38 (UNOISE3) and 41 (DADA2) haplotypes were detected by eDNA analysis. When DADA2 was used, every haplotype shared by at least two specimens, with one exception, was detected in all filter replicates. This study showed that eDNA analysis for evaluating intraspecific genetic diversity yields results comparable to those of large-scale conventional capture-based methods, suggesting that it could become a more efficient survey method for investigating intraspecific genetic diversity in the field.
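The counts reported above translate directly into detection rates. The short Python sketch below only restates the arithmetic from the abstract; the variable and function names are assumptions introduced for illustration.

```python
def detection_rate(detected, expected):
    """Fraction of the Sanger-confirmed haplotypes recovered from eDNA."""
    return detected / expected


sanger_haplotypes = 42                      # haplotypes from 96 specimens
detected = {"UNOISE3": 38, "DADA2": 41}     # haplotypes recovered per denoiser

rates = {
    tool: detection_rate(n, sanger_haplotypes)
    for tool, n in detected.items()
}

for tool, rate in rates.items():
    # Prints 90.5% for UNOISE3 and 97.6% for DADA2.
    print(f"{tool}: {detected[tool]}/{sanger_haplotypes} = {rate:.1%}")

# 20 filters with 15 PCR replicates each.
total_pcr_replicates = 20 * 15
print(total_pcr_replicates)
```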


2018 ◽  
Author(s):  
Vasco Elbrecht ◽  
Dirk Steinke

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over recent years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. Here we conceptualize a streamlined metabarcoding workflow in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one approach, and we hope this article will stimulate discussion and publication of alternatives and extensions.
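The plate arithmetic above (three 96-well plates, 288 uniquely tagged wells) can be sketched in Python as a mapping from well positions to tag combinations. The abstract does not state how many forward and reverse tags the fusion primer sets contain, so the 12 x 24 split, the tag names and the well naming scheme below are purely hypothetical, as is the number of Illumina index combinations used in the final line.

```python
from itertools import product
import string

ROWS = string.ascii_uppercase[:8]   # plate rows A to H
COLS = range(1, 13)                 # plate columns 1 to 12
PLATES = (1, 2, 3)                  # three 96-well plates per run

# Hypothetical tag sets: 12 forward x 24 reverse = 288 combinations.
FWD_TAGS = [f"F{i}" for i in range(1, 13)]
REV_TAGS = [f"R{i}" for i in range(1, 25)]

# Every well across the three plates, e.g. "P1_A1" ... "P3_H12".
wells = [f"P{p}_{r}{c}" for p, r, c in product(PLATES, ROWS, COLS)]
tag_pairs = list(product(FWD_TAGS, REV_TAGS))

if len(wells) != len(tag_pairs):
    raise ValueError("need exactly one tag combination per well")

# One unique (forward, reverse) tag pair per well.
layout = dict(zip(wells, tag_pairs))
print(len(layout))

# Adding Illumina indices multiplies capacity; 8 index combinations is an
# assumed example value, not a figure from the article.
illumina_index_combinations = 8
print(len(layout) * illumina_index_combinations)
```

Keeping the layout as an explicit dictionary makes it easy to check that no tag pair is reused before samples are pooled.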


2018 ◽  
Author(s):  
Vasco Elbrecht ◽  
Dirk Steinke

The viability of DNA metabarcoding for assessment of freshwater macrozoobenthos has been demonstrated over recent years. It has matured to a stage where it can be applied to monitoring at a large scale, keeping pace with increased high-throughput sequencing (HTS) capacity. However, workflows and sample tagging need to be optimized to accommodate hundreds of samples within a single sequencing run. Here we conceptualize a streamlined metabarcoding workflow in which samples are processed in 96-well plates. Each sample is replicated starting with tissue extraction. Negative and positive controls are included to ensure data reliability. With our newly developed fusion primer sets for the BF2+BR2 primer pair, up to three 96-well plates (288 wells) can be uniquely tagged for a single Illumina sequencing run. By including Illumina indices, tagging can be extended to thousands of samples. We hope that our metabarcoding workflow will be used as a practical guide for future large-scale biodiversity assessments involving freshwater invertebrates. However, we also want to point out that this is just one possible metabarcoding approach, and we hope this article will stimulate discussion and publication of alternatives and extensions.

