Development and validation of DNA metabarcoding COI primers for aquatic invertebrates using the R package "PrimerMiner"
1) DNA metabarcoding is a powerful tool to assess biodiversity by amplifying and sequencing a standardized gene marker region. However, commonly used barcoding genes, such as the cytochrome c oxidase subunit I (COI) region for animals, are highly variable. Thus, different taxa in the communities under study are often not amplified equally well, and some might even remain undetected due to primer bias. To reduce these problems, optimized region- and/or ecosystem-specific metabarcoding primers are necessary.
2) We developed the R package PrimerMiner, which batch downloads DNA barcode gene sequences from the BOLD and NCBI databases for specified target taxa and then applies sequence clustering to reduce biases introduced by differing numbers of available sequences per species. To design primers targeting freshwater invertebrates, we downloaded COI data for the 15 most important invertebrate groups relevant for stream ecosystem assessment. Four primer sets with high base degeneracy were developed, and their performance was tested by sequencing ten mock community samples, each consisting of 52 freshwater invertebrate taxa. Additionally, we evaluated the developed primers against other metabarcoding primers in silico using PrimerMiner.
3) Amplification and sequencing were successful for all ten mock community samples with the four different primer combinations. The developed primers varied in amplification efficiency and in the number of taxa detected, but all primer sets detected more taxa than the standard Folmer barcoding primers. Additionally, the BF / BR primers amplified taxa very consistently, with the BF2+BR2 and BF2+BR1 primer combinations amplifying even more consistently than a previously tested ribosomal marker (16S). Except for the BF1+BR1 primer combination, all BF / BR primers detected all 42 insect taxa present in the mock samples. In silico evaluation of the developed primers showed that they are also likely to work very well on other, non-aquatic invertebrate samples.
4) With PrimerMiner, we provide a useful tool to obtain relevant sequence data for targeted primer development and evaluation. Our sequence datasets generated with the newly developed metabarcoding primers demonstrate that optimized primers with high base degeneracy are superior to classical markers and enable the detection of almost 100% of the animal taxa present in a sample using the standard COI barcoding gene. Therefore, the PrimerMiner package and the primers developed using this tool are useful beyond the assessment of biodiversity in aquatic ecosystems.
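
To illustrate why high base degeneracy can reduce primer bias, the following minimal Python sketch counts mismatches between a degenerate primer and a set of binding sites. It is an illustration under stated assumptions only, not PrimerMiner's own evaluation procedure: the primer and binding-site sequences are hypothetical toy data, the function names are invented for this example, and every mismatch is treated as equally important regardless of its position or type.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct oligonucleotides a primer encodes."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def count_mismatches(primer, site):
    """Count positions where the site base is not covered by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return sum(
        1
        for p, s in zip(primer.upper(), site.upper())
        if s not in IUPAC[p]
    )


def fraction_matching(primer, sites, max_mismatches=0):
    """Fraction of binding sites with at most max_mismatches mismatches."""
    hits = sum(1 for s in sites if count_mismatches(primer, s) <= max_mismatches)
    return hits / len(sites)


# Hypothetical toy data (NOT real primers or real COI binding sites).
toy_sites = ["GGAACTGG", "GGTACCGG", "GGCACAGG", "GCCACAGA"]
degenerate_primer = "GGWACNGG"      # W = A/T, N = any base
non_degenerate_primer = "GGAACTGG"  # a single fixed sequence

print(degeneracy(degenerate_primer))
print(fraction_matching(degenerate_primer, toy_sites))
print(fraction_matching(non_degenerate_primer, toy_sites))
```

In this toy data set, the degenerate primer matches more binding sites without mismatches than the fixed sequence does, at the cost of being a mixture of several oligonucleotides. A realistic evaluation would additionally weight mismatches, for example by their distance from the 3' end of the primer.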