Assessing the reliability of medicinal Dendrobium sequences in GenBank for botanical species identification

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hoi-Yan Wu ◽  
Kwun-Tin Chan ◽  
Grace Wing-Chiu But ◽  
Pang-Chui Shaw

DNA-based methods are a promising tool for species identification and are widely used in various fields. DNA barcoding has already been included in several pharmacopoeias for the identification of medicinal materials and botanicals. The accuracy and validity of DNA-based methods depend on the accuracy and taxonomic reliability of the reference sequences in the database being searched. Here we evaluated the annotation quality and taxonomic reliability of selected barcode loci (rbcL, matK, psbA-trnH, trnL-trnF and ITS) of 41 medicinal Dendrobium species downloaded from GenBank. Annotations of most accessions are incomplete. Only 53.06% of the 2041 accessions downloaded contain a reference to a voucher specimen. Only 31.60% and 4.8% of the entries are annotated with country of origin and collector or assessor, respectively. Taxonomic reliability of the sequences was evaluated by a Megablast search based on similarity to sequences submitted by other research groups. A small number of sequences (211, 7.14%) were regarded as highly doubted. Moreover, 10 out of 60 complete chloroplast genomes contain highly doubted sequences. Our findings suggest that sequences from GenBank should be used with caution for species-level identification. The scientific community should provide more of the key information on sample identity and traceability when depositing sequences in public databases.
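The annotation-completeness percentages above can be reproduced in spirit with a small script. The following is a minimal sketch, not the authors' pipeline: it assumes the records are available as GenBank flat-file text, that voucher, origin, and collector or assessor information are stored in the source-feature qualifiers `/specimen_voucher`, `/country`, `/collected_by` and `/identified_by`, and it uses two made-up records with hypothetical accessions.

```python
import re

# Source-feature qualifiers checked here (assumed mapping to the abstract's
# "voucher", "country of origin" and "collector or assessor" fields).
QUALIFIERS = ("specimen_voucher", "country", "collected_by", "identified_by")


def split_records(flatfile_text):
    """Split GenBank flat-file text into records on the '//' terminator line."""
    chunks = re.split(r"^//\s*$", flatfile_text, flags=re.MULTILINE)
    return [chunk for chunk in chunks if chunk.strip()]


def annotation_coverage(flatfile_text):
    """Return the percentage of records that carry each qualifier."""
    records = split_records(flatfile_text)
    coverage = {}
    for name in QUALIFIERS:
        pattern = re.compile(r"^\s+/" + name + r"=", re.MULTILINE)
        hits = sum(1 for record in records if pattern.search(record))
        coverage[name] = 100.0 * hits / len(records) if records else 0.0
    return coverage


# Two made-up records (hypothetical accessions and values), for illustration only.
SAMPLE = """LOCUS       XX000001  600 bp  DNA  linear  PLN
FEATURES             Location/Qualifiers
     source          1..600
                     /organism="Dendrobium sp."
                     /specimen_voucher="HYPO-001"
                     /country="China"
//
LOCUS       XX000002  600 bp  DNA  linear  PLN
FEATURES             Location/Qualifiers
     source          1..600
                     /organism="Dendrobium sp."
//
"""

if __name__ == "__main__":
    for qualifier, percent in annotation_coverage(SAMPLE).items():
        print(f"{qualifier}: {percent:.1f}%")
```

A plain-text scan keeps the sketch dependency-free. A real survey would more likely use a dedicated GenBank parser and would need to handle qualifiers that span several lines.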

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4499 ◽  
Author(s):  
Aisha Tahir ◽  
Fatma Hussain ◽  
Nisar Ahmed ◽  
Abdolbaset Ghorbani ◽  
Amer Jamil

In pursuit of fast and accurate species-level molecular identification methods, we tested six DNA barcodes, namely ITS2, matK, rbcLa, ITS2+matK, ITS2+rbcLa, matK+rbcLa and ITS2+matK+rbcLa, for their capacity to identify frequently consumed but geographically isolated medicinal species of Fabaceae and Poaceae indigenous to the desert of Cholistan. Data were analysed by BLASTn sequence similarity, pairwise sequence divergence in TAXONDNA, and phylogenetic (neighbour-joining and maximum-likelihood trees) methods. Comparison of six barcode regions showed that ITS2 has the highest number of variable sites for the tested Fabaceae (209/360) and Poaceae (106/365) species, the highest species-level identification (40%) in the BLASTn procedure, a distinct DNA barcoding gap, 100% correct species identification in the BM and BCM functions of TAXONDNA, and a clear clade pattern with high nodal support in phylogenetic trees in both families. ITS2+matK+rbcLa followed ITS2 in its species-level identification capacity. The study concludes by advocating DNA barcoding as an effective tool for species identification and ITS2 as the best barcode region for identifying medicinal species of Fabaceae and Poaceae. This research has practical application potential in the fields of pharmaco-vigilance, trade of medicinal plants and biodiversity conservation.


Author(s):  
Takeru Nakazato

DNA barcoding technology has become widely employed by biodiversity and molecular biology researchers to identify species and analyze their phylogeny. Recently, DNA metabarcoding and environmental DNA (eDNA) technology have been developed by expanding the concept of DNA barcoding. These techniques analyze the diversity and quantity of organisms within an environment by detecting biogenic DNA in water and soil. They are particularly popular for monitoring fish species living in rivers and lakes (Takahara et al. 2012). BOLD Systems (Barcode of Life Database systems, Ratnasingham and Hebert 2007) is a database for DNA barcoding, archiving 8.5 million barcodes (as of August 2020) along with the voucher specimens from which the DNA barcode sequences are derived, with taxonomy, country of collection, and holding museum as metadata (e.g. https://www.boldsystems.org/index.php/Public_RecordView?processid=TRIBS054-16). Many barcoding data are also submitted to GenBank (Sayers et al. 2020), a database of DNA sequences managed by NCBI (National Center for Biotechnology Information, US). The number of DNA barcode records, i.e. COI (cytochrome c oxidase I) gene records for animals, has grown significantly (Porter and Hajibabaei 2018). BOLD imports DNA barcoding data from GenBank, and many DNA barcoding records in GenBank are also assigned BOLD IDs. However, we still have to refer to both BOLD and GenBank data when performing DNA barcoding. I have previously investigated the registration of DNA barcoding data in GenBank, especially the association with BOLD, using insects and flowering plants as examples (Nakazato 2019). Here, I surveyed the number of species covered by BOLD and GenBank. I used fish data as an example because eDNA research is particularly focused on fish. I downloaded all GenBank files for vertebrates from NCBI FTP (File Transfer Protocol) sites (as of November 2019). Of the GenBank fish entries, 86,958 (7.3%) were assigned BOLD identifiers (IDs).
The NCBI taxonomy database has registrations for 39,127 species of fish, and 20,987 scientific names at the species level (i.e., excluding names that included sp., cf. or aff.). GenBank entries with BOLD IDs covered 11,784 species (30.1%) and 8,665 species-level names (41.3%). I also obtained the whole "specimens and sequences combined data" for fish from BOLD Systems (as of November 2019). In BOLD, 273,426 entries are registered as fish. Of these entries, 211,589 were assigned GenBank IDs, i.e. had values in the "genbank_accession" column, and 121,748 were imported from GenBank, i.e. had the description "Mined from GenBank, NCBI" in the "institution_storing" column. The BOLD data covered 18,952 fish species and 15,063 species-level names, but 35,500 entries were assigned no species-level names and 22,123 entries did not even have family-level names. At the species level, 8,067 names co-occurred in GenBank and BOLD, with 6,997 BOLD-specific names and 599 GenBank-specific names. GenBank has 425,732 fish entries with voucher IDs, of which 340,386 were not assigned a BOLD ID. Of these 340,386 entries, 43,872 are registrations for COI genes, which could be candidates for DNA barcodes. These candidates include 4,201 species that are not included in BOLD; adding these data would enable us to identify 19,863 fish to the species level. For researchers, it would be very useful if both BOLD and GenBank DNA barcoding data could be searched in one place. For this purpose, it is necessary to integrate data from the two databases. A lot of biodiversity data are recorded based on the Darwin Core standard, while DNA sequencing data are sometimes integrated or cross-linked by RDF (Resource Description Framework).
It may not be technically difficult to integrate these data, but the two databases reference different species data, EoL (The Encyclopedia of Life) for BOLD and the NCBI taxonomy for GenBank, and the differences in taxonomic systems make it difficult to match records by scientific name. GenBank has fields for the latitude and longitude of the specimens sampled, and Porter and Hajibabaei (2018) argue that this information should be enhanced. However, this information may be better described in the specimen and occurrence databases. The integration of barcoding data with the specimen and occurrence data will solve these problems. Most importantly, it will save the researcher from having to register the same information in multiple databases. In the field of biodiversity, attention may so far have focused only on DNA barcode sequences among gene sequences. The museomics community regards museum-preserved specimens as rich resources for DNA studies because their biodiversity information can accompany the extraction and analysis of their DNA (Nakazato 2018). GenBank is useful for biodiversity studies due to its low rate of mislabelling (Leray et al. 2019). In the future, we will be working with a variety of DNA, including genomes from museum specimens as well as DNA barcoding. This will require more integrated use of biodiversity information and DNA sequence data. This integration is also of interest to molecular biologists and bioinformaticians.
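The name-overlap counts in the abstract above amount to set operations on species-level names. The sketch below is illustrative only and is not the author's code: the function names and toy name lists are hypothetical, and the filter simply drops names containing sp., cf. or aff., as the abstract describes. The final lines recompute two reported percentages and show one reading, an assumption on our part, under which the figure of 19,863 is the sum of the BOLD species-level names, the GenBank-specific names and the new COI candidates.

```python
import re

# Open-nomenclature markers that the abstract excludes from species-level names.
OPEN_NOMENCLATURE = re.compile(r"\b(sp|cf|aff)\.")


def is_species_level(name):
    """True for a name of at least two words that lacks sp., cf. or aff."""
    return len(name.split()) >= 2 and not OPEN_NOMENCLATURE.search(name)


def compare_name_sets(bold_names, genbank_names):
    """Count species-level names shared by, or specific to, each database."""
    bold = {name for name in bold_names if is_species_level(name)}
    genbank = {name for name in genbank_names if is_species_level(name)}
    return {
        "shared": len(bold & genbank),
        "bold_only": len(bold - genbank),
        "genbank_only": len(genbank - bold),
    }


def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # Toy name lists (hypothetical), not data from either database.
    bold = ["Gadus morhua", "Salmo salar", "Gadus sp."]
    genbank = ["Gadus morhua", "Danio rerio", "Danio cf. rerio"]
    print(compare_name_sets(bold, genbank))

    # Figures quoted in the abstract.
    print(percent(11784, 39127))  # species covered by GenBank entries with BOLD IDs
    print(percent(8665, 20987))   # species-level names covered
    print(15063 + 599 + 4201)     # one reading of the 19,863 total
```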


1984 ◽  
Vol 62 (12) ◽  
pp. 2677-2679 ◽  
Author(s):  
Donald W. Thomas ◽  
Stephen D. West

Several ultrasonic detection and analysis systems are currently used to provide information on the echolocation calls of bats, in many cases permitting species-level identification. This note briefly describes these systems and alerts potential users to the inaccuracies of the simplest device, the superheterodyne QMC Mini Bat Detector. Without adequate calibration, the error in this detector is such that reliable identification of bats by echolocation call characteristics cannot be achieved.


2021 ◽  
Vol 26 (1) ◽  
pp. 17-26
Author(s):  
Nenik Kholilah ◽  
Norma Afiati ◽  
Subagiyo Subagiyo

According to FAO data, species-level identification of octopus in world fisheries is very limited, and octopuses are cryptic in nature. At the same time, Indonesia is one of the world's top ten octopus exporters. This study therefore aimed to determine octopus species based on phylogenetic analysis of mtDNA COI. Octopuses were collected from nine locations throughout Indonesia, i.e., Anambas, Bangka-Belitung, Cirebon, Karimunjawa, Tuban, Lombok, Buton, Wakatobi and Jayapura. Samples were mostly tentacles collected directly from fishermen. After preservation in 96% ethanol, DNA was extracted using 10% Chelex®, amplified by PCR with Folmer's primers, and sequenced by the Sanger method. Of the 24 samples sequenced, four species of Octopodidae belonging to three genera were recognized: Amphioctopus aegina, Hapalochlaena fasciata, Octopus laqueus and Octopus cyanea. Mean pairwise distances ranged from 0 to 5.5% within species and from 12.9 to 15.8% between species. This study clearly confirmed the difference between the genera Amphioctopus and Hapalochlaena (15.5%), as well as between O. laqueus and O. cyanea (12.9%), which had previously not been completely distinguished. Although species identification using DNA sequences for shallow-water benthic octopus species may be considered premature, this study indicates the possible application of COI sequences for species identification, thereby providing a preliminary dataset for future DNA barcoding of octopus, in particular for Indonesian waters.
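The within-species and between-species distances reported above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the abstract does not say which distance model was used, so the code uses the uncorrected p-distance (the proportion of differing sites), and the species labels and aligned sequences are toy values rather than the study's data.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":  # skip gaps and ambiguity codes
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def _mean(values):
    """Arithmetic mean, or None when there are no pairs of that kind."""
    return sum(values) / len(values) if values else None


def mean_distances(samples):
    """samples: list of (species, aligned_sequence) pairs.

    Returns (mean within-species distance, mean between-species distance).
    """
    within, between = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(samples, 2):
        target = within if sp1 == sp2 else between
        target.append(p_distance(seq1, seq2))
    return _mean(within), _mean(between)


if __name__ == "__main__":
    # Toy alignment with hypothetical species labels "A" and "B".
    toy = [
        ("A", "ACGTACGTAC"),
        ("A", "ACGTACGTAT"),
        ("B", "ACGAACGAAT"),
        ("B", "ACGAACGAAT"),
    ]
    within, between = mean_distances(toy)
    print(f"within: {within:.2f}, between: {between:.2f}")
```

A between-species mean that clearly exceeds the within-species mean, as in the 12.9 to 15.8% versus 0 to 5.5% ranges reported, is what is usually described as a barcoding gap.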


PhytoKeys ◽  
2022 ◽  
Vol 188 ◽  
pp. 1-18
Author(s):  
Nguyen Nhat Linh ◽  
Pham Le Bich Hang ◽  
Huynh Thi Thu Hue ◽  
Nguyen Hai Ha ◽  
Ha Hong Hanh ◽  
...  

Certain species within the genus Panax L. (Araliaceae) contain pharmacologically valuable ginsenosides, also known as ginseng saponins. Species containing these compounds are of high commercial value and are thus of particular urgency for conservation. However, within this genus, identifying the particular species that contain these compounds by morphological means is challenging. DNA barcoding is one method that is considered promising for species-level identification. However, in an evolutionarily complex genus such as Panax, commonly used DNA barcodes such as nrITS, matK, psbA-trnH and rbcL do not provide species-level resolution. A recent in silico study proposed a set of novel chloroplast markers, trnQ-rps16, trnS-trnG, petB, and trnE-trnT, for species-level identification within Panax. In the current study, the discriminatory efficiency of these molecular markers is assessed and validated using 91 reference barcoding sequences and 38 complete chloroplast genomes for seven species, one unidentified species and one subspecies of Panax, and two outgroup species of Aralia L., along with empirical data of Panax taxa present in Vietnam, via both distance-based and tree-based methods. The results show that trnQ-rps16 can classify every clade tested here with species-level resolution, including the highly valuable Panax vietnamensis Ha et Grushv. We thus propose that this molecular marker be used for identification of species within Panax to support both its conservation and commercial trade.


2021 ◽  
Vol 8 ◽  
Author(s):  
Jisu Yeom ◽  
Nayeon Park ◽  
Raehyuk Jeong ◽  
Wonchoel Lee

MALDI Time-of-Flight Mass Spectrometry (MALDI-TOF MS) provides a fast and reliable alternative method for species-level identification of pathogens and various metazoans. Compared to the commonly used mitochondrial cytochrome c oxidase subunit I (mtCOI) barcoding, the advantages of MALDI-TOF MS are rapid species identification and low cost. In this study, we used MALDI-TOF MS to determine whether the spectral patterns of different species can be used for species identification. We obtained a total of 138 spectra from individual specimens of Tigriopus, which were subsequently used for various cluster analyses. Our findings revealed that these spectra form three clear clusters with high AU value support. This study validates the viability of MALDI-TOF MS as a methodology for higher-resolution species identification, allowing detection of cryptic species of Harpacticoida. In addition, we propose a new species, Tigriopus koreanus sp. nov., by utilizing integrative methods such as morphological comparison, mtCOI barcoding, and MALDI-TOF MS.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Farshid O Sirjani ◽  
Edwin E Lewis

A new dipterous pest is reported for the first time on commercial pistachios from Sirjan, Kerman province, Iran. The genus of the insect was determined to be Resseliella Seitner (Diptera: Cecidomyiidae). Adults are light brown to brown in color and 0.8–1.5 mm in length, with females generally slightly larger than males. Females have an elongated ovipositor, which is characteristic of the genus. Larvae are orange in color, 2–3 mm in length in the later instars, feed under bark without inducing galls, and cause branch dieback on trees of various ages. Brown to black discolorations are observed on plant tissues under bark where the larvae feed. Infested growth, from both the current and the previous year, ranged from 0.5 to 1.2 cm in diameter and was all located on outer branches. Dry leaves and fruit clusters on infested branches remain attached, which may be used to recognize infestation by the gall midge. Dark-colored, sunken spots with splits on the bark located at the base of the wilted sections of the shoots are also symptoms of Resseliella sp. larval activity. Species-level identification of the gall midge is currently underway.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Bobby Lim-Ho Kong ◽  
Hyun-Seung Park ◽  
Tai-Wai David Lau ◽  
Zhixiu Lin ◽  
Tae-Jin Yang ◽  
...  

Ilex is a monogeneric plant group (containing approximately 600 species) in the Aquifoliaceae family and one of the most commonly used medicinal herbs. However, its taxonomy and phylogenetic relationships at the species level are debatable. Herein, we obtained the complete chloroplast genomes of all 19 Ilex types that are native to Hong Kong. The genomes are conserved in structure, gene content and arrangement. The chloroplast genomes range in size from 157,119 bp in Ilex graciliflora to 158,020 bp in Ilex kwangtungensis. All these genomes contain 125 genes, of which 88 are protein-coding and 37 are tRNA genes. Four highly varied sequences (rps16-trnQ, rpl32-trnL, ndhD-psaC and ycf1) were found. The number of repeats in the Ilex genomes is mostly conserved, but the number of repeating motifs varies. The phylogenetic relationship among the 19 Ilex genomes, together with eight other available genomes in other studies, was investigated. Most of the species could be correctly assigned to the section or even series level, consistent with previous taxonomy, except Ilex rotunda var. microcarpa, Ilex asprella var. tapuensis and Ilex chapaensis. These species were reclassified; I. rotunda was placed in the section Micrococca, while the other two were grouped with the section Pseudoaquifolium. These studies provide a better understanding of Ilex phylogeny and refine its classification.

