Toward a global reference database of COI barcodes for marine zooplankton

2021 ◽  
Vol 168 (6) ◽  
Author(s):  
Ann Bucklin ◽  
Katja T. C. A. Peijnenburg ◽  
Ksenia N. Kosobokova ◽  
Todd D. O’Brien ◽  
Leocadio Blanco-Bercial ◽  
...  

Abstract Characterization of species diversity of zooplankton is key to understanding, assessing, and predicting the function and future of pelagic ecosystems throughout the global ocean. The marine zooplankton assemblage, including only metazoans, is highly diverse and taxonomically complex, with an estimated ~28,000 species of 41 major taxonomic groups. This review provides a comprehensive summary of DNA sequences for the barcode region of mitochondrial cytochrome oxidase I (COI) for identified specimens. The foundation of this summary is the MetaZooGene Barcode Atlas and Database (MZGdb), a new open-access data and metadata portal that is linked to NCBI GenBank and BOLD data repositories. The MZGdb provides enhanced quality control and tools for assembling COI reference sequence databases that are specific to selected taxonomic groups and/or ocean regions, with associated metadata (e.g., collection georeferencing, verification of species identification, molecular protocols), and tools for statistical analysis, mapping, and visualization. To date, over 150,000 COI sequences for ~5,600 described species of marine metazoan plankton (including holo- and meroplankton) are available via the MZGdb portal. This review uses the MZGdb as a resource for summaries of COI barcode data and metadata for important taxonomic groups of marine zooplankton and selected regions, including the North Atlantic, Arctic, North Pacific, and Southern Oceans. The MZGdb is designed to provide a foundation for analysis of species diversity of marine zooplankton based on DNA barcoding and metabarcoding for assessment of marine ecosystems and rapid detection of the impacts of climate change.

Author(s):  
Nicole Foster ◽  
Kor-jent Dijk ◽  
Ed Biffin ◽  
Jennifer Young ◽  
Vicki Thomson ◽  
...  

A proliferation in environmental DNA (eDNA) research has increased the reliance on reference sequence databases to assign unknown DNA sequences to known taxa. Without comprehensive reference databases, DNA extracted from environmental samples cannot be correctly assigned to taxa, limiting the use of this genetic information to identify organisms in unknown sample mixtures. For animals, standard metabarcoding practices involve amplification of the mitochondrial cytochrome c oxidase subunit 1 (CO1) region, which is universally amplifiable across the majority of animal taxa. This region, however, does not work well as a DNA barcode for plants and fungi, and there is no similar universal single barcode locus with the same species resolution. Generating reference sequences has therefore been more difficult, and several loci have been suggested for use in parallel to reach species-level identification. For this reason, we developed a multi-gene targeted capture approach to generate reference DNA sequences for plant taxa across 20 target chloroplast gene regions in a single assay. We successfully compiled a reference database for 93 temperate coastal plants, including seagrasses, mangroves, and saltmarsh/samphire species. We demonstrate the importance of a comprehensive reference database to prevent species going undetected in eDNA studies. We also investigate how using multiple chloroplast gene regions impacts the ability to discriminate between taxa.


Genome ◽  
2019 ◽  
Vol 62 (3) ◽  
pp. 160-169 ◽  
Author(s):  
Wieland Meyer ◽  
Laszlo Irinyi ◽  
Minh Thuy Vi Hoang ◽  
Vincent Robert ◽  
Dea Garcia-Hermoso ◽  
...  

With new or emerging fungal infections, human and animal fungal pathogens are a growing threat worldwide. Current diagnostic tools are slow, non-specific at the species and subspecies levels, and require specific morphological expertise to accurately identify pathogens from pure cultures. DNA barcodes are easily amplified, universal, short species-specific DNA sequences, which enable rapid identification by comparison with a well-curated reference sequence collection. The primary fungal DNA barcode, the ITS region, was introduced in 2012 and is now routinely used in diagnostic laboratories. However, the ITS region only accurately identifies around 75% of all medically relevant fungal species, which has prompted the development of a secondary barcode to increase the resolution power and suitability of DNA barcoding for fungal disease diagnostics. The translational elongation factor 1α (TEF1α) was selected in 2015 as a secondary fungal DNA barcode, but it has not been implemented in practice, owing to the absence of a reference database. Here, we have established a quality-controlled reference database for the secondary barcode that, together with the ISHAM-ITS database, forms the ISHAM barcode database, available online at http://its.mycologylab.org/ . We encourage active contributions from the mycology community.


2007 ◽  
Vol 79 (3) ◽  
pp. 369-379 ◽  
Author(s):  
Rubens M. Lopes

Marine zooplankton research in Brazil has been primarily descriptive, with most studies focusing on community structure analysis and related issues. The composition and spatial distribution of several taxonomic groups are currently well known, although less-abundant and small-sized taxa, as well as the initial stages of almost all species, have received little attention. Some numerically important taxa such as heterotrophic protists, ctenophores, acoel turbellarians and ostracods remain virtually unstudied. Large sectors of the continental shelf have not been sampled in detail, particularly those areas influenced by the North Brazil Current (5°N-15°S). Zooplankton abundance and biomass in offshore waters have seldom been quantified, and information on the distribution and vertical migration of meso- and bathypelagic species is lacking. Additional faunistic assessments must target those less-studied taxa and geographical locations. However, priority in ecological studies should be given to process-oriented investigations aimed at understanding the mechanisms controlling zooplankton distribution, trophic interactions within pelagic food webs and production cycles in relation to the physical environment. An effort should be made to incorporate state-of-the-art sampling technology and analytical methods into future research projects.


2021 ◽  
Vol 4 ◽  
Author(s):  
François Keck ◽  
Florian Altermatt

Reference databases of sequences that have been taxonomically assigned are a key element for DNA-based identification of organisms. Accurate and complete reference databases are necessary to associate a correct taxonomic name with the sequences obtained in studies using metabarcoding. Today many research projects using DNA metabarcoding include the development of a custom reference database, often derived from large repositories like GenBank. At the same time, many projects are focussing on the development of ready-to-use databases validated by experts and targeting specific markers and taxonomic groups. While mainstream tools such as spreadsheet software may be suitable for managing small databases, they quickly become insufficient when the amount of data increases and validation operations become more complex. There is a clear need for user-friendly and powerful tools to manipulate biological sequences and manage reference databases. The R language, which is free software and has already been adopted by many researchers to perform their analyses, is highly suitable for developing such tools. In this talk, we will outline the approach we recommend for handling small- to middle-sized reference databases, which still make up the majority of projects. We will advocate that a simple tabular approach, where each sequence constitutes an observation, may be the most adequate. While such a single table may be less flexible and less optimized than relational databases or more complex data structures, it is easy to maintain and allows the direct use of modern dataframe-centric tools. We will specifically present and discuss two R packages that can be used jointly to make reference database development more accessible and more reproducible. First, we will briefly introduce bioseq (Keck 2020), which is dedicated to biological sequence manipulation and analysis. The package implements classes and functions to make analyses of complex datasets including DNA, RNA or protein sequences as simple as possible. The strength of bioseq is that it provides standard and more advanced functions to perform low-level operations through a simple and consistent programming interface. Then we will present refdb, which has been developed as an environment for semi-automatic and assisted construction of reference databases. The refdb package is a reference database manager offering a set of powerful functions to import, organize, clean, filter, audit and export the data. We will outline how these two packages together can speed up reference database generation and handling, and contribute to standardization and repeatability in metabarcoding studies.
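The tabular idea described in this abstract (one row per sequence, cleaned and filtered with simple operations) can be sketched roughly as follows. This is a minimal Python illustration only and is not the refdb or bioseq API, which are R packages. The `Record` fields, the `clean` function, its thresholds, and the sample rows (toy sequence strings, not real barcodes) are all hypothetical.

```python
from dataclasses import dataclass

VALID_BASES = set("ACGT")


@dataclass(frozen=True)
class Record:
    """One reference sequence = one row (observation) in the table."""
    accession: str   # hypothetical identifier
    species: str     # taxonomic label; empty if unidentified
    marker: str      # e.g. "COI"
    sequence: str    # DNA string


def clean(records, min_length=8, max_ambiguous=0):
    """Drop rows that lack a species name, are too short, or contain too
    many non-ACGT characters, then remove exact duplicates
    (same species, marker and sequence). Thresholds are illustrative."""
    seen = set()
    kept = []
    for rec in records:
        seq = rec.sequence.upper()
        ambiguous = sum(1 for base in seq if base not in VALID_BASES)
        key = (rec.species, rec.marker, seq)
        if not rec.species or len(seq) < min_length:
            continue
        if ambiguous > max_ambiguous or key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Toy table: the sequences are made-up strings, not real barcodes.
table = [
    Record("X1", "Calanus finmarchicus", "COI", "ACGTACGTAC"),
    Record("X2", "Calanus finmarchicus", "COI", "acgtacgtac"),  # duplicate of X1
    Record("X3", "", "COI", "ACGTACGTAC"),                      # no species name
    Record("X4", "Oithona similis", "COI", "ACGTNNGTAC"),       # ambiguous bases
    Record("X5", "Oithona similis", "COI", "ACGT"),             # too short
    Record("X6", "Oithona similis", "COI", "TTGTACGTAC"),
]

reference = clean(table)
print([rec.accession for rec in reference])
```

Keeping every sequence as a flat record means each cleaning step is an ordinary filter over rows, which is the maintainability argument the abstract makes for a single table over a relational schema.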


2020 ◽  
Vol 8 ◽  
Author(s):  
Dagoberto Venera-Pontón ◽  
Amy Driskell ◽  
Sammy De Grave ◽  
Darryl Felder ◽  
Justin Scioli ◽  
...  

DNA barcoding is a useful tool to identify the components of mixed or bulk samples, as well as to identify individuals that lack morphologically diagnostic features. However, the reference database of DNA barcode sequences is particularly sparsely populated for marine invertebrates and for tropical taxa. We used samples collected as part of two field courses, focused on graduate training in taxonomy and systematics, to generate DNA sequences of the barcode fragments of cytochrome c oxidase subunit I (COI) and mitochondrial ribosomal 16S genes for 447 individuals, representing at least 129 morphospecies of decapod crustaceans. COI sequences for 36% (51/140) of the species and 16S sequences for 26% (37/140) of the species were new to GenBank. Automatic Barcode Gap Discovery identified 140 operational taxonomic units (OTUs), which largely coincided with the morphospecies delimitations. Barcode identifications (i.e. matches to identified sequences) were especially useful for OTUs within Synalpheus, a group that is notoriously difficult to identify and rife with cryptic species, a number of which we could not identify to species based on morphology. Non-concordance between morphospecies and barcode OTUs also occurred in a few cases of suspected cryptic species. As mitochondrial pseudogenes are particularly common in decapods, we investigate the potential for this dataset to include pseudogenes and discuss the utility of these sequences as species identifiers (i.e. barcodes). These results demonstrate that material collected and identified during training activities can provide useful incidental barcode reference samples for under-studied taxa.


2016 ◽  
Author(s):  
Panu Somervuo ◽  
Douglas Yu ◽  
Charles Xu ◽  
Yinqiu Ji ◽  
Jenni Hultman ◽  
...  

Abstract A crucial step in the use of DNA markers for biodiversity surveys is the assignment of Linnaean taxonomies (species, genus, etc.) to sequence reads. This allows the use of all the information known based on the taxonomic names. Taxonomic placement of DNA barcoding sequences is inherently probabilistic because DNA sequences contain errors, because there is natural variation among sequences within a species, and because reference databases are incomplete and can have false annotations. However, most existing bioinformatics methods for taxonomic placement either exclude uncertainty, or quantify it using metrics other than probability. In this paper we evaluate the performance of a recently proposed probabilistic taxonomic placement method PROTAX by applying it to both annotated reference sequence data as well as unknown environmental data. Our four case studies include contrasting taxonomic groups (fungi, bacteria, mammals, and insects), variation in the length and quality of the barcoding sequences (from individually Sanger-sequenced sequences to short Illumina reads), variation in the structures and sizes of the taxonomies (from 800 to 130 000 species), and variation in the completeness of the reference databases (representing 15% to 100% of the species). Our results demonstrate that PROTAX yields essentially unbiased assessment of probabilities of taxonomic placement, and thus that its quantification of species identification uncertainty is reliable. As expected, the accuracy of taxonomic placement increases with increasing coverage of taxonomic and reference sequence databases, and with increasing ratio of genetic variation among taxonomic levels over within taxonomic levels. Our results show that reliable species-level identification from environmental samples is still challenging, and thus neglecting identification uncertainty can lead to spurious inference. A key aim for future research is the completion and pruning of taxonomic and reference sequence databases, and making these two types of data compatible.


2014 ◽  
Vol 31 (2) ◽  
Author(s):  
Jose Antonio Moreira Lima

This paper is concerned with the planning, implementation and some results of the Oceanographic Modeling and Observation Network, named REMO, for Brazilian regional waters. Ocean forecasting has been an important scientific issue over the last decade due to studies related to climate change as well as applications related to short-range oceanic forecasts. The South Atlantic Ocean has a deficit of oceanographic measurements when compared to other ocean basins such as the North Atlantic Ocean and the North Pacific Ocean. It is a challenge to design an ocean forecasting system for a region with poor observational coverage of in-situ data. Fortunately, most ocean forecasting systems rely heavily on the assimilation of surface fields such as sea surface height anomaly (SSHA) or sea surface temperature (SST), acquired by environmental satellites, that can accurately provide information that constrains major surface current systems and their mesoscale activity. An integrated approach is proposed here in which the large-scale circulation in the Atlantic Ocean is modeled in a first step, and gradually nested into higher-resolution regional models that are able to resolve important processes such as the Brazil Current and associated mesoscale variability, continental shelf waves, local and remote wind forcing, and others. This article presents the overall strategy to develop the models using a network of Brazilian institutions and their related expertise along with international collaboration. This work has some similarity with the goals of the international project Global Ocean Data Assimilation Experiment OceanView (GODAE OceanView).


2021 ◽  
Vol 5 (2) ◽  
Author(s):  
Olivia M Gearner ◽  
Marcin J Kamiński ◽  
Kojun Kanda ◽  
Kali Swichtenberg ◽  
Aaron D Smith

Abstract Sepidiini is a speciose tribe of desert-inhabiting darkling beetles, which contains a number of poorly defined taxonomic groups and is in need of revision at all taxonomic levels. In this study, two previously unrecognized lineages were discovered, based on morphological traits, among the extremely speciose genera Psammodes Kirby, 1819 (164 species and subspecies) and Ocnodes Fåhraeus, 1870 (144 species and subspecies), namely the Psammodes spinosus species-group and Ocnodes humeralis species-group. In order to test their phylogenetic placement, a phylogeny of the tribe was reconstructed based on analyses of DNA sequences from six nonoverlapping genetic loci (CAD, wg, COI JP, COI BC, COII, and 28S) using Bayesian and maximum likelihood inference methods. The aforementioned, morphologically defined, species-groups were recovered as distinct and well-supported lineages within Molurina + Phanerotomeina and are interpreted as independent genera, respectively, Tibiocnodes Gearner & Kamiński gen. nov. and Tuberocnodes Gearner & Kamiński gen. nov. A new species, Tuberocnodes synhimboides Gearner & Kamiński sp. nov., is also described. Furthermore, as the recovered phylogenetic placement of Tibiocnodes and Tuberocnodes undermines the monophyly of Molurina and Phanerotomeina, an analysis of the available diagnostic characters for those subtribes is also performed. As a consequence, Phanerotomeina is considered as a synonym of the newly redefined Molurina sens. nov. Finally, spectrograms of vibrations produced by substrate tapping of two Molurina species, Toktokkus vialis (Burchell, 1822) and T. synhimboides, are presented.


Botany ◽  
2014 ◽  
Vol 92 (12) ◽  
pp. 901-910 ◽  
Author(s):  
Joel P. Olfelt ◽  
William A. Freyman

Taxa of Rhodiola L. (Crassulaceae) generally grow in arctic or alpine habitats. Some Rhodiola species are used medicinally; one taxon, Rhodiola integrifolia Raf. subsp. leedyi (Rosend. & J.W.Moore) Moran (Leedy’s roseroot), is rare and endangered; and the group’s biogeography in North America is intriguing because of distributional disjunctions and the possibility that Rhodiola rhodantha (A.Gray) H.Jacobsen (2n = 7II) and Rhodiola rosea L. (2n = 11II) hybridized to form Rhodiola integrifolia Raf. (2n = 18II). Recent studies of the North American Rhodiola suggest that the group’s current taxonomy is misleading. We analyzed nuclear and chloroplast DNA sequences (internal transcribed spacer (ITS), trnL intron, trnL–trnF spacer, trnS–trnG spacer) from the North American Rhodiola taxa. We combined our data with GenBank sequences from Asian Rhodiola species, performed parsimony, maximum likelihood (ML), and Bayesian phylogenetic analyses, and applied a Bayesian clock model to the ITS data. Our analyses reveal two major Rhodiola clades, suggest that hybridization between R. rhodantha and R. rosea lineages was possible, show two distinct clades within R. integrifolia, and demonstrate that a Black Hills, South Dakota, Rhodiola population should be reclassified as Leedy’s roseroot. We recommend that R. integrifolia be revised, and that the Black Hills Leedy’s roseroot population be managed as part of that rare and endangered taxon.


2021 ◽  
Author(s):  
Lauren E. Manck ◽  
Jiwoon Park ◽  
Benjamin J. Tully ◽  
Alfonso M. Poire ◽  
Randelle M. Bundy ◽  
...  

Abstract It is now widely accepted that siderophores play a role in marine iron biogeochemical cycling. However, the mechanisms by which siderophores affect the availability of iron from specific sources and the resulting significance of these processes on iron biogeochemical cycling as a whole have remained largely untested. In this study, we develop a model system for testing the effects of siderophore production on iron bioavailability using the marine copiotroph Alteromonas macleodii ATCC 27126. Through the generation of the knockout cell line ΔasbB::kmr, which lacks siderophore biosynthetic capabilities, we demonstrate that the production of the siderophore petrobactin enables the acquisition of iron from mineral sources and weaker iron-ligand complexes. Notably, the utilization of lithogenic iron, such as that from atmospheric dust, indicates a significant role for siderophores in the incorporation of new iron into marine systems. We have also detected petrobactin, a photoreactive siderophore, directly from seawater in the mid-latitudes of the North Pacific and have identified the biosynthetic pathway for petrobactin in bacterial metagenome-assembled genomes widely distributed across the global ocean. Together, these results improve our mechanistic understanding of the role of siderophore production in iron biogeochemical cycling in the marine environment wherein iron speciation, bioavailability, and residence time can be directly influenced by microbial activities.

