Forensics and DNA Barcodes – Do Identification Errors Arise in the Lab or in the Sequence Libraries?

2019 ◽  
Author(s):  
Mikko Pentinsaari ◽  
Sujeevan Ratnasingham ◽  
Scott E. Miller ◽  
Paul D. N. Hebert

Abstract. Forensic studies often require the determination of biological materials to a species level. As such, DNA-based approaches to identification, particularly DNA barcoding, are attracting increased interest. The capacity of DNA barcodes to assign newly encountered specimens to a species relies upon access to informatics platforms, such as BOLD and GenBank, which host libraries of reference sequences and support the comparison of new sequences to them. As parameterization of these libraries expands, DNA barcoding has the potential to make valuable contributions in diverse forensic contexts. However, a recent publication called for caution after finding that both platforms performed poorly in identifying specimens of 17 common insect species. This study follows up on this concern by asking if the misidentifications reflected problems in the reference libraries or in the query sequences used to test them. Because this reanalysis revealed that missteps in acquiring and analyzing the query sequences were responsible for the misidentifications, a workflow is described to minimize such errors in future investigations. The present study also revealed the limitations imposed by the lack of a polished species-level taxonomy for many groups. In such cases, forensic applications can be strengthened by mapping the geographic distributions of sequence-based species proxies rather than waiting for the maturation of formal taxonomic systems based on morphology.

Author(s):  
Takeru Nakazato

DNA barcoding technology has become widely employed by biodiversity and molecular biology researchers to identify species and analyze their phylogeny. Recently, DNA metabarcoding and environmental DNA (eDNA) technologies have been developed by extending the concept of DNA barcoding. These techniques analyze the diversity and quantity of organisms within an environment by detecting biogenic DNA in water and soil. eDNA analysis is particularly popular for monitoring fish species living in rivers and lakes (Takahara et al. 2012). BOLD Systems (Barcode of Life Data System, Ratnasingham and Hebert 2007) is a database for DNA barcoding that archives 8.5 million barcodes (as of August 2020) along with metadata on the voucher specimen from which each barcode sequence is derived, including taxonomy, country of collection, and the museum holding the voucher (e.g. https://www.boldsystems.org/index.php/Public_RecordView?processid=TRIBS054-16). Many barcoding data are also submitted to GenBank (Sayers et al. 2020), a DNA sequence database managed by NCBI (National Center for Biotechnology Information, US). The number of DNA barcode records, i.e. the COI (cytochrome c oxidase I) gene for animals, has grown significantly (Porter and Hajibabaei 2018). BOLD imports DNA barcoding data from GenBank, and many DNA barcoding records in GenBank are also assigned BOLD IDs. However, we still have to refer to both BOLD and GenBank data when performing DNA barcoding. I have previously investigated the registration of DNA barcoding data in GenBank, especially the association with BOLD, using insects and flowering plants as examples (Nakazato 2019). Here, I surveyed the number of species covered by BOLD and GenBank. I used fish data as an example because eDNA research is particularly focused on fish. I downloaded all GenBank files for vertebrates from NCBI FTP (File Transfer Protocol) sites (as of November 2019). Of the GenBank fish entries, 86,958 (7.3%) were assigned BOLD identifiers (IDs). 
The NCBI taxonomy database has registrations for 39,127 species of fish, and 20,987 scientific names at the species level (i.e., excluding names that included sp., cf. or aff.). GenBank entries with BOLD IDs covered 11,784 species (30.1%) and 8,665 species-level names (41.3%). I also obtained the complete "specimens and sequences combined data" for fish from BOLD Systems (as of November 2019). In BOLD, 273,426 entries are registered as fish. Of these, 211,589 entries were assigned GenBank IDs, i.e. had values in the "genbank_accession" column, and 121,748 entries were imported from GenBank, i.e. had the description "Mined from GenBank, NCBI" in the "institution_storing" column. The BOLD data covered 18,952 fish species and 15,063 species-level names, but 35,500 entries were assigned no species-level names and 22,123 entries lacked even family-level names. At the species level, 8,067 names co-occurred in GenBank and BOLD, with 6,997 BOLD-specific names and 599 GenBank-specific names. GenBank has 425,732 fish entries with voucher IDs, of which 340,386 were not assigned a BOLD ID. Of these 340,386 entries, 43,872 are registrations for COI genes, which could be candidates for DNA barcodes. These candidates include 4,201 species that are not included in BOLD; adding these data would therefore enable us to identify 19,863 fish to the species level. For researchers, it would be very useful if both BOLD and GenBank DNA barcoding data could be searched in one place. For this purpose, it is necessary to integrate data from the two databases. Much biodiversity data is recorded according to the Darwin Core standard, while DNA sequencing data are sometimes integrated or cross-linked by RDF (Resource Description Framework). 
It may not be technically difficult to integrate these data, but the two databases rely on different taxonomic references: BOLD refers to the EoL (Encyclopedia of Life), whereas GenBank refers to the NCBI taxonomy, and these differences make it difficult to match records by scientific name. GenBank has fields for the latitude and longitude of the specimens sampled, and Porter and Hajibabaei (2018) argue that this information should be enhanced. However, this information may be better described in the specimen and occurrence databases. Integrating barcoding data with the specimen and occurrence data would solve these problems. Most importantly, it would save the researcher from having to register the same information in multiple databases. In the field of biodiversity, DNA barcode sequences may have been the only gene sequences to receive attention and use. The museomics community regards museum-preserved specimens as rich resources for DNA studies because their biodiversity information can accompany the extraction and analysis of their DNA (Nakazato 2018). GenBank is useful for biodiversity studies due to its low rate of mislabelling (Leray et al. 2019). In the future, we will be working with a variety of DNA, including genomes from museum specimens as well as DNA barcodes. This will require more integrated use of biodiversity information and DNA sequence data. This integration is also of interest to molecular biologists and bioinformaticians.
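
The coverage percentages and the shared/database-specific name counts quoted in this abstract are simple ratio and set computations. The following is a minimal sketch of that arithmetic, not the author's actual pipeline: the counts are copied from the abstract, while the function names `coverage` and `compare_names` are hypothetical and introduced only for illustration.

```python
# Counts copied from the abstract above (fish, as of November 2019).
NCBI_SPECIES = 39_127              # fish species registered in the NCBI taxonomy
NCBI_SPECIES_LEVEL_NAMES = 20_987  # names excluding sp., cf. or aff.
GENBANK_BOLD_SPECIES = 11_784      # species in GenBank entries carrying BOLD IDs
GENBANK_BOLD_NAMES = 8_665         # species-level names in those entries


def coverage(covered: int, total: int) -> float:
    """Return covered/total as a percentage rounded to one decimal place."""
    return round(100 * covered / total, 1)


def compare_names(bold_names: set[str], genbank_names: set[str]) -> dict[str, int]:
    """Count names shared by both databases and names unique to each.

    Hypothetical helper: it assumes names were already normalized to the
    same taxonomy, which the abstract notes is the hard part in practice.
    """
    return {
        "shared": len(bold_names & genbank_names),
        "bold_only": len(bold_names - genbank_names),
        "genbank_only": len(genbank_names - bold_names),
    }


species_cov = coverage(GENBANK_BOLD_SPECIES, NCBI_SPECIES)
name_cov = coverage(GENBANK_BOLD_NAMES, NCBI_SPECIES_LEVEL_NAMES)
print(species_cov, name_cov)
```

With real data, `compare_names` would receive the sets of species-level names extracted from each database's download; exact string matching is an assumption here and would undercount shared names wherever the two taxonomies spell or classify a species differently.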


Parasitology ◽  
2020 ◽  
Vol 147 (13) ◽  
pp. 1499-1508
Author(s):  
Susanne Reier ◽  
Helmut Sattmann ◽  
Thomas Schwaha ◽  
Hans-Peter Fuehrer ◽  
Elisabeth Haring

Abstract. Acanthocephalans are obligate parasites of vertebrates, mostly of fish. There is limited knowledge about the diversity of fish-parasitizing Acanthocephala in Austria. Seven determined species and an undetermined species are recorded for Austrian waters. Morphological identification of acanthocephalans remains challenging due to their sparse morphological characters and their high intraspecific variations. DNA barcoding is an effective tool for taxonomic assignment at the species level. In this study, we provide new DNA barcoding data for three genera of Acanthocephala (Pomphorhynchus Monticelli, 1905, Echinorhynchus Zoega in Müller, 1776 and Acanthocephalus Koelreuter, 1771) obtained from different fish species in Austria and provide an important contribution to acanthocephalan taxonomy and distribution in Austrian fish. Nevertheless, the taxonomic assignment of one species must remain open. We found indications for cryptic species within Echinorhynchus cinctulus Porta, 1905. Our study underlines the difficulties in processing reliable DNA barcodes and highlights the importance of the establishment of such DNA barcodes to overcome these. To achieve this goal, it is necessary to collect and compare material across Europe allowing a comprehensive revision of the phylum in Europe.


2016 ◽  
Vol 371 (1702) ◽  
pp. 20160025 ◽  
Author(s):  
Xin Zhou ◽  
Paul B. Frandsen ◽  
Ralph W. Holzenthal ◽  
Clare R. Beet ◽  
Kristi R. Bennett ◽  
...  

DNA barcoding was intended as a means to provide species-level identifications through associating DNA sequences from unknown specimens to those from curated reference specimens. Although barcodes were not designed for phylogenetics, they can be beneficial to the completion of the Tree of Life. The barcode database for Trichoptera is relatively comprehensive, with data from every family, approximately two-thirds of the genera, and one-third of the described species. Most Trichoptera, as with most of life's species, have never been subjected to any formal phylogenetic analysis. Here, we present a phylogeny with over 16 000 unique haplotypes as a working hypothesis that can be updated as our estimates improve. We suggest a strategy of implementing constrained tree searches, which allow larger datasets to dictate the backbone phylogeny, while the barcode data fill out the tips of the tree. We also discuss how this phylogeny could be used to focus taxonomic attention on ambiguous species boundaries and hidden biodiversity. We suggest that systematists continue to differentiate between ‘Barcode Index Numbers’ (BINs) and ‘species’ that have been formally described. Each has utility, but they are not synonyms. We highlight examples of integrative taxonomy, using both barcodes and morphology for species description. This article is part of the themed issue ‘From DNA barcodes to biomes’.


2020 ◽  
Vol 21 (8) ◽  
Author(s):  
Viet The Ho ◽  
Minh Phuong Nguyen

Abstract. Ho VT, Nguyen MP. 2020. An in silico approach for evaluation of rbcL and matK loci for DNA barcoding of Cucurbitaceae family. Biodiversitas 21: 3879-3885. DNA barcodes have been used intensively to discriminate between species in the Cucurbitaceae family. The main aim of this study is to evaluate the effectiveness of the rbcL and matK loci for 16 species of the Cucurbitaceae family using an in silico approach. For the analysis, sequences were first retrieved from NCBI and their sequence parameters were calculated. The sequences were then aligned, phylogenetic trees were constructed, and species resolution ability was examined. The obtained data show that resolving capacity varies among species. The rbcL region is suitable for distinguishing five species, namely S. edule, M. cochinchinensis, L. aegyptiaca, C. melo, and C. pepo, whereas the matK locus is more suitable for a different set of five species: M. balsamina, M. cochinchinensis, M. charantia, S. edule, and C. sativus. The resolving power improves sharply when the rbcL + matK combination is analyzed, with up to nine species resolved: C. lanatus, B. hispida, C. melo, C. sativus, C. pepo, C. argyrosperma, L. aegyptiaca, S. edule, and M. cochinchinensis. Therefore, the integration of the rbcL and matK loci may improve the assessment of genetic relatedness at the species level among members of the Cucurbitaceae family. The obtained information could be important for choosing proper DNA barcode loci for phylogenetic study of this crop family.


2020 ◽  
Vol 11 (2) ◽  
pp. 145-152
Author(s):  
Nevenka Ćelepirović ◽  
Sanja Novak Agbaba ◽  
Monika Karija Vlahović

Saprotrophic, endophytic, and parasitic fungi were detected in samples collected in the forests of the management unit East Psunj and Papuk Nature Park in Croatia. Disease symptoms, the morphology of fruiting bodies and fungal cultures, and DNA barcoding were combined to determine the fungi at the genus or species level. DNA barcoding is a standardized and automated identification of species based on recognition of highly variable DNA sequences. It is widely applied to diagnose fungi in biological specimens. DNA samples for DNA barcoding were isolated from infected tree tissues, fungal fruiting bodies or fungal cultures. The ITS or ITS2 regions of the fungal DNA were sequenced and aligned with the reference sequences in GenBank (NCBI) using BLAST. The sizes of the ITS and ITS2 sequences were 512-584 bp and 248-326 bp, respectively. The sequences showed a high identity of 97.21%-100% at 98%-100% coverage with reference sequences in GenBank (NCBI). The exception was the species Amphilogia gyrosa, which showed 95.65% identity at 100% coverage. Two fungi were determined at genus level: Cladosporium sp., and Cytospora sp., while 11 fungi were determined at species level: Alternaria alternata, Aureobasidium pullulans, Amphilogia gyrosa, Capronia pilosella, Cryphonectria parasitica, Exidia glandulosa, Epicoccum nigrum, Penicillium glabrum, Pezicula carpinea, Rosellinia corticium, and Stereum hirsutum.
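
Identity and coverage figures like those reported above are typically used to decide whether a BLAST hit supports an identification. The following is a minimal sketch of such a filter, assuming hypothetical cut-offs of 97% identity and 98% coverage; the abstract does not state a decision rule, and the `Hit` class and `best_hit` function are illustrative names, not part of BLAST or any library.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hit:
    """One reference match for a query sequence (illustrative structure)."""
    taxon: str
    identity: float  # percent identity to the reference sequence
    coverage: float  # percent of the query covered by the alignment


def best_hit(
    hits: Iterable[Hit],
    min_identity: float = 97.0,  # assumed threshold, not taken from the study
    min_coverage: float = 98.0,  # assumed threshold, not taken from the study
) -> Optional[Hit]:
    """Return the highest-identity hit passing both thresholds, or None."""
    passing = [
        h for h in hits
        if h.identity >= min_identity and h.coverage >= min_coverage
    ]
    # Rank by identity first, then by coverage to break ties.
    return max(passing, key=lambda h: (h.identity, h.coverage), default=None)


# The exception reported in the abstract falls below the assumed identity cut-off.
exception = best_hit([Hit("Amphilogia gyrosa", 95.65, 100.0)])
print(exception)
```

Under these assumed thresholds the Amphilogia gyrosa match would be flagged for further evidence rather than accepted automatically, which is consistent with the study's approach of combining disease symptoms and morphology with the sequence data.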


2019 ◽  
Author(s):  
Vincent Manzanilla ◽  
Irene Teixidor-Toneu ◽  
Gary J. Martin ◽  
Peter M. Hollingsworth ◽  
Hugo J. de Boer ◽  
...  

Abstract. Uncontrolled and unsustainable trade in natural resources is an increasingly important threat to global biodiversity. In recent years, molecular identification methods have been proposed as tools to monitor global supply chains, to support regulation and legislative protection of species in trade, and to enhance consumer protection by establishing whether a traded product contains the species it is supposed to contain. However, development of an effective assay that routinely provides species-level identification and information on geographical origin of plants remains elusive, with standard plant DNA barcodes often providing only ‘species group’ or genus-level resolution. Here, we demonstrate the efficacy of target-capture genomic DNA barcoding, based on 443 nuclear markers, for establishing the identity and geographic origin of samples traded as the red-listed medicinal plant Anacyclus pyrethrum (L.) Lag. We also use this approach to provide insights into product adulteration and substitution in national and international supply chains. The target-capture approach outperforms standard plant DNA barcodes and entire plastid genome sequences, and works with DNA from degraded samples. This approach offers the potential to meet the ‘holy-grail’ of plant DNA barcoding, namely routine species-level DNA-based identification, while also providing insights into geographic origin. This represents a major development for biodiversity conservation and for supporting the regulation and monitoring of trade in natural plant products.

Significance. Unsustainable exploitation of natural resources is a major driver of biodiversity loss. Up to a third of the world’s biodiversity is considered threatened by trade, but a lack of traceability methods for traded products impedes evaluation of international supply chains and the global impacts of trade on biodiversity. In this study, we pioneer the use of target capture-based genomic DNA barcoding. 
We compare this with standard DNA barcodes and complete plastid genome sequences for the identification of plant species in trade and for tracing their geographic origin. The target-capture barcoding approach described here presents a major advance for tracing the geographic origin of plant-based food and medicines and establishing the identity of illegally traded species. It enables better understanding and targeting of conservation action, and enhances capacity to assess the quality, safety and authenticity of traded products.


2020 ◽  
Author(s):  
Conny P. Serite ◽  
Ofentse K. Ntshudisane ◽  
Eugene Swart ◽  
Luisa Simbine ◽  
Graça L. M. Jaime ◽  
...  

Abstract. Seahorses and pipefishes are heavily exploited for use in Traditional Chinese Medicine (TCM), and less frequently for curio markets or as aquarium fish. A number of recent studies have used DNA barcoding to identify species sold at TCM markets in East Asia, but the usefulness of this approach in determining the region of origin remains poorly explored. Here, we generated DNA barcodes of dried seahorses and pipefishes destined for TCM that were confiscated at South Africa’s largest airport because they lacked the export permits required for the CITES-listed seahorses. These were compared with published sequences and new sequences generated for Mozambican seahorses, with the aim of determining whether it is possible to identify their country of origin. All pipefishes were identified as Syngnathoides biaculeatus, a widespread Indo-Pacific species, but the published sequence data did not provide sufficient resolution to identify the region of origin. The same was true of the majority of seahorses, which could not even be identified to species level because they clustered among an unresolved species complex whose sequences were published under the names Hippocampus kuda, H. fuscus and H. capensis. The presence of a few specimens of a second seahorse, H. camelopardalis, suggests that the shipment originated from East Africa because the range of this seahorse is centred around this region, but again, it was not possible to determine their country of origin. Even though seahorses and pipefishes have high levels of genetic population structure because of their low dispersal potential, DNA barcoding was only suitable to tentatively identify species, but not their region of origin. DNA barcoding is increasingly used to identify illegally traded wildlife, but our results show that more sophisticated methods are needed to monitor and police the trade in seahorses and pipefishes.


2019 ◽  
Vol 42 (2) ◽  
pp. 137-150
Author(s):  
Konstantin A. Efetov ◽  
Anna V. Kirsanova ◽  
Zoya S. Lazareva ◽  
Ekaterina V. Parshkova ◽  
Gerhard M. Tarmann ◽  
...  

The present study provides a DNA barcode library for the world Zygaenidae (Lepidoptera). It reports 1031 COI gene DNA barcode sequences for more than 240 species in four of the five subfamilies of the family Zygaenidae. This is about 20% of the world Zygaenidae species. Our results demonstrate the specificity of the COI gene sequences at the species level in most of the studied Zygaenidae and agree with already established taxonomic opinions. The study confirms the effectiveness of DNA barcoding as a tool for determination of most Zygaenidae species. However, some of the results are contradictory. Some cases of shared barcodes have been found, as well as cases of deep intraspecific sequence divergence in species that are well separated by morphological and biological characters. These cases are discussed in detail. Overall, when combined with morphological and biochemical data, as well as biological and ecological observations, DNA barcoding results can be a useful support for taxonomic decisions.


Author(s):  
Marc J.C. de Jong ◽  
Wim M. Busing ◽  
Max T. Otten

Biological materials are damaged rapidly in the electron beam, limiting the amount of information that can be obtained in the transmission electron microscope. The discovery that observation at cryo temperatures strongly reduces beam damage (in addition to making it unnecessary to use chemical fixatives, dehydration agents and stains, which introduce artefacts) has been an important step forward in preserving the ‘live’ situation and makes it possible to study the relation between function, chemical composition and morphology. Among the many cryo-applications, the most challenging is perhaps the determination of the atomic structure. Henderson and co-workers were able to determine the structure of the purple membrane by electron crystallography, providing an understanding of the membrane's working as a proton pump. As far as understood at present, the main stumbling block in achieving high resolution appears to be a random movement of atoms or molecules in the specimen within a fraction of a second after exposure to the electron beam, which destroys the highest-resolution detail sought.


2016 ◽  
Vol 2 (1) ◽  
pp. 37-42 ◽  
Author(s):  
J.M. Pino Moreno ◽  
A. Ganguly

In the present paper we have determined the fatty acid content of some edible insects of Mexico. A comparative analysis of the insect species studied in this research showed that caproic acid was present in a minimal proportion which ranged between 0.01 for Periplaneta americana (nymphs) and 0.06 (g/100 g, dry basis) for Euschistus strenuus. The highest proportion of caprylic acid (0.09) was found in Tenebrio molitor (adults). Atta sp. had the highest amount of capric acid (0.26). Polistes sp. was found to be rich in lauric acid (0.77) and had the highest content of myristic acid (5.64). Dactylopius sp. and E. strenuus were rich in palmitic acid (14.89). Euschistus taxcoensis had the highest quantity of palmitoleic acid (12.06). Llaveia axin exhibited the highest quantity of stearic acid (22.75). Polistes sp. was found to be rich in oleic acid (38.28). The highest quantity of linoleic acid was observed in T. molitor (larvae) (10.89), and the highest content of linolenic acid (7.82) was obtained in L. axin. A comparison between the species under the present investigation revealed that, in general, the insects are poor in caproic, caprylic, capric, lauric, myristic, palmitoleic and linolenic acids, because the quantities were either minimal or could not be detected at all. They had moderate quantities of stearic, palmitic and linoleic acids and had high quantities of oleic acid. Finally, it was concluded that although a single insect species cannot fulfil the total fatty acid needs of a human, several species consumed in combination could supply a good amount of this highly valued nutrient.

