Towards the extended barcode concept: Generating DNA reference data through genome skimming of Danish plants

2021 ◽  
Author(s):  
Physilia Y.S Chua ◽  
Frederik Leerhoi ◽  
Emilia M.R Langkjaer ◽  
Ashot Margaryan ◽  
Christina L Noer ◽  
...  

Recently, there has been a push towards the extended barcode concept of utilising chloroplast genomes (cpGenome) and nuclear ribosomal DNA (nrDNA) sequences for molecular identification of plants instead of the standard barcode regions. These extended barcodes have a wide range of applications, including biodiversity monitoring and assessment, primer design, and evolutionary studies. However, these extended barcodes are not well represented in global reference databases. To fill this gap, we generated cpGenomes and nrDNA reference data from genome skims of 184 plant species collected in Denmark. We further explored the application of our generated reference data for molecular identifications of plants in an environmental DNA metagenomics study. We assembled partial cpGenomes for 82.1% of sequenced species and full or partial nrDNA sequences for 83.7% of species. We added all assemblies to GenBank, of which chloroplast reference data from 101 species and nuclear reference data from 6 species were not previously represented. On average, we recovered 45 genes per species. The rate of recovery of standard barcodes was higher for nuclear barcodes (>89%) than chloroplast barcodes (<60%). Extracted DNA yield did not affect assembly outcome, whereas high GC content affected it negatively. For the in silico simulation of metagenomic reads, taxonomic assignments using the reference data generated had better species resolution (94.9%) than assignments using GenBank (18.1%), without any identification errors. Genome skimming generates reference data of both standard barcodes and other loci, contributing to the global DNA reference database for plants.

2021 ◽  
Vol 4 ◽  
Author(s):  
Virginie Marques ◽  
Tristan Milhau ◽  
Camille Albouy ◽  
Tony Dejean ◽  
Stéphanie Manel ◽  
...  

Environmental DNA metabarcoding has recently emerged as a non-invasive tool for aquatic biodiversity inventories, frequently surpassing traditional methods for detecting a wide range of taxa in most habitats. One of the major limitations currently impairing the large-scale application of DNA-based inventories, such as eDNA or bulk-sample analysis, is the lack of species sequences available in public genetic databases. These gaps are still largely unknown spatially and taxonomically for most regions of the world, which can hinder targeted future sequencing efforts. We propose GAPeDNA, a user-friendly web interface (Fig. 1) that provides a global overview of genetic database completeness for a given taxon across space and conservation status. As an initial application, we synthesized data from regional checklists for marine and freshwater fishes along with their IUCN conservation status to provide global maps of species coverage using the European Nucleotide Archive public reference database for 19 metabarcoding primers. This tool automates the scanning of gaps in these databases to guide future sequencing efforts and support the deployment of DNA-based inventories at larger scale. It is flexible and can be expanded to other taxa and primers upon data availability. Using our global fish case study, we show that gaps increase toward the tropics, where species diversity and the number of threatened species are the highest. The case study highlights priority areas for fish sequencing like the Congo, the Mekong and the Mississippi freshwater basins, which host more than 60 non-sequenced threatened fish species. For marine fishes, the Caribbean and East Africa host up to 42 non-sequenced threatened species. As an open-access, updatable and flexible tool, GAPeDNA can be used to evaluate the completeness of sequence reference libraries of various markers and for any taxonomic group.


2021 ◽  
Vol 22 (4) ◽  
pp. 2104
Author(s):  
Pedro Robles ◽  
Víctor Quesada

Eleven published articles (4 reviews, 7 research papers) are collected in the Special Issue entitled “Organelle Genetics in Plants.” This selection of papers covers a wide range of topics related to chloroplasts and plant mitochondria research: (i) organellar gene expression (OGE) and, more specifically, chloroplast RNA editing in soybean, mitochondria RNA editing, and intron splicing in soybean during nodulation, as well as the study of the roles of transcriptional and posttranscriptional regulation of OGE in plant adaptation to environmental stress; (ii) analysis of the nuclear integrants of mitochondrial DNA (NUMTs) or plastid DNA (NUPTs); (iii) sequencing and characterization of mitochondrial and chloroplast genomes; (iv) recent advances in plastid genome engineering. Here we summarize the main findings of these works, which represent the latest research on the genetics, genomics, and biotechnology of chloroplasts and mitochondria.


2013 ◽  
Vol 103 (5) ◽  
pp. 479-487 ◽  
Author(s):  
Efrén Remesal ◽  
Blanca B. Landa ◽  
María del Mar Jiménez-Gasco ◽  
Juan A. Navas-Cortés

Populations of Sclerotium rolfsii, the causal organism of Sclerotium root-rot on a wide range of hosts, can be placed into mycelial compatibility groups (MCGs). In this study, we evaluated three different molecular approaches to unequivocally identify each of 12 previously identified MCGs. These included restriction fragment length polymorphism (RFLP) patterns of the internal transcribed spacer (ITS) region of nuclear ribosomal DNA (rDNA) and sequence analysis of two protein-coding genes: translation elongation factor 1α (EF1α) and RNA polymerase II subunit two (RPB2). A collection of 238 single-sclerotial isolates representing 12 MCGs of S. rolfsii was obtained from diseased sugar beet plants from Chile, Italy, Portugal, and Spain. ITS-RFLP analysis using four restriction enzymes (AluI, HpaII, RsaI, and MboI) displayed a low degree of variability among MCGs. Only three different restriction profiles were identified among S. rolfsii isolates, with no correlation to MCG or to geographic origin. Based on nucleotide polymorphisms, the RPB2 gene was more variable among MCGs than the EF1α gene. Thus, 10 of 12 MCGs could be characterized utilizing the RPB2 region only, while the EF1α region resolved 7 MCGs. However, the analysis of combined partial sequences of EF1α and RPB2 genes allowed discrimination among each of the 12 MCGs. All isolates belonging to the same MCG showed identical nucleotide sequences that differed by at least one nucleotide from those of any other MCG. The consistency of our results in identifying the MCG of a given S. rolfsii isolate using the combined sequences of EF1α and RPB2 genes was confirmed using blind trials. Our study demonstrates that sequence variation in the protein-coding genes EF1α and RPB2 may be exploited as a diagnostic tool for MCG typing in S. rolfsii as well as to identify previously undescribed MCGs.
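
The reasoning behind combining two markers can be illustrated with a minimal sketch. The sequences and group labels below are invented toy data, not the study's alignments, and the function name `resolved_groups` is hypothetical. A group counts as resolved when no other group shares its (concatenated) marker sequence, so two markers that each resolve few groups may resolve all of them together.

```python
from collections import Counter

# Toy marker sequences per group (hypothetical data, for illustration only).
toy = {
    "MCG-A": {"EF1a": "ACGT", "RPB2": "TTGA"},
    "MCG-B": {"EF1a": "ACGT", "RPB2": "TTGC"},
    "MCG-C": {"EF1a": "ACGA", "RPB2": "TTGC"},
    "MCG-D": {"EF1a": "ACGA", "RPB2": "TTGA"},
    "MCG-E": {"EF1a": "ACCT", "RPB2": "TAGA"},
}

def resolved_groups(data, markers):
    """Return the groups whose concatenated marker sequence is unique."""
    keys = {g: "|".join(seqs[m] for m in markers) for g, seqs in data.items()}
    counts = Counter(keys.values())
    return sorted(g for g, k in keys.items() if counts[k] == 1)

ef1a_only = resolved_groups(toy, ["EF1a"])
rpb2_only = resolved_groups(toy, ["RPB2"])
combined = resolved_groups(toy, ["EF1a", "RPB2"])
print(len(ef1a_only), len(rpb2_only), len(combined))
```

In this toy set each marker alone separates only one group, while the concatenated sequences separate all five, which mirrors the pattern reported in the abstract without reproducing its data.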


2021 ◽  
Vol 4 ◽  
Author(s):  
Mélissa Jaquier ◽  
Camille Albouy ◽  
Wilhelmine Bach ◽  
Conor Waldock ◽  
Virginie Marques ◽  
...  

Islands have traditionally served as model systems to study ecological and evolutionary processes (Warren et al. 2015) and could also represent a relevant system to study environmental DNA (eDNA). Isolated island reefs that are affected by climatic threats would particularly benefit from cost- and time-efficient biodiversity surveys to set priorities for their conservation. Among time-efficient methods, eDNA has emerged as a novel molecular metabarcoding technique to detect biodiversity from simple environmental samples even in remote marine environments. However, eDNA monitoring techniques for marine environments are at a developmental phase, with a few remaining unknowns related to DNA residence time and movement. In particular, the redistribution of eDNA via ocean currents could blur the composition signal and its association with local environmental conditions (Goldberg et al. 2016). Here, we investigated the variation in eDNA detection along a distance gradient across four islands in the French Scattered Islands. We collected 30 L of surface water per filter at increasing distances from the islands' reefs (0 m, 250 m, 500 m, 750 m). Using a metabarcoding protocol, we used the teleo primers to target a fraction of 12S mitochondrial DNA to detect Actinopterygii and Elasmobranchii. We then applied a sequence clustering approach to generate Molecular Taxonomic Units (MOTUs), which were assigned to a taxonomic group using a reference database. By assigning eDNA sequences to species using a public reference database, we classified species according to their preferred habitat type as either benthic/demersal or pelagic. Our results show no significant relationship between distance and MOTU richness for either habitat type. By using a Joint Species Distribution Modelling approach (JSDM, Hierarchical Modelling of Species Communities), we retained the multidimensional information captured by eDNA and detected species- and family-specific responses to distance (Fig. 1). 
We showed that benthic MOTUs were found in closer proximity to the reef, while typical pelagic MOTUs were found at greater distances from the reef. Hence, MOTU-level analyses coupled with JSDM were more informative than analyses that aggregate MOTUs into coarser richness measures. Altogether, our eDNA distance sampling gradient detected an ecological signal of habitat selection by fish species, which suggests that eDNA could help understand the behavior of species and their distribution in marine environments at a fine spatial scale.


Author(s):  
Nicole Foster ◽  
Kor-jent Dijk ◽  
Ed Biffin ◽  
Jennifer Young ◽  
Vicki Thomson ◽  
...  

A proliferation in environmental DNA (eDNA) research has increased the reliance on reference sequence databases to assign unknown DNA sequences to known taxa. Without comprehensive reference databases, DNA extracted from environmental samples cannot be correctly assigned to taxa, limiting the use of this genetic information to identify organisms in unknown sample mixtures. For animals, standard metabarcoding practices involve amplification of the mitochondrial Cytochrome-c oxidase subunit 1 (CO1) region, which is universally amplifiable across the majority of animal taxa. This region, however, does not work well as a DNA barcode for plants and fungi, and there is no similar universal single barcode locus that has the same species resolution. Therefore, generating reference sequences has been more difficult, and several loci have been suggested for use in parallel to reach species-level identification. For this reason, we developed a multi-gene targeted capture approach to generate reference DNA sequences for plant taxa across 20 target chloroplast gene regions in a single assay. We successfully compiled a reference database for 93 temperate coastal plants including seagrasses, mangroves, and saltmarshes/samphires. We demonstrate the importance of a comprehensive reference database to prevent species going undetected in eDNA studies. We also investigate how using multiple chloroplast gene regions impacts the ability to discriminate between taxa.


2021 ◽  
Vol 12 ◽  
Author(s):  
Abbas Jamal ◽  
Jun Wen ◽  
Zhi-Yao Ma ◽  
Ibrar Ahmed ◽  
Abdullah ◽  
...  

Chimonanthus of Calycanthaceae is a small genus endemic to China, with unusual winter-blooming sweet flowers, widely cultivated for ornamental and medicinal uses. The evolution of Chimonanthus plastomes and the phylogenetic relationships of the genus remain unresolved due to limited availability of genetic resources. Here, we report fully assembled and annotated chloroplast genomes of five Chimonanthus species. The chloroplast genomes of the genus (size range 153,010–153,299 bp) reveal high similarities in gene content, gene order, GC content, codon usage, amino acid frequency, simple sequence repeats, oligonucleotide repeats, synonymous and non-synonymous substitutions, and transition and transversion substitutions. Signatures of positive selection are detected in atpF and rpoB genes in C. campanulatus. The correlations among substitutions, InDels, and oligonucleotide repeats reveal weak to strong correlations in distantly related species at the intergeneric levels, and very weak to weak correlations among closely related Chimonanthus species. Chloroplast genomes are used to reconstruct a well-resolved phylogenetic tree, which supports the monophyly of Chimonanthus. Within Chimonanthus, C. praecox and C. campanulatus form one clade, while C. grammatus, C. salicifolius, C. zhejiangensis, and C. nitens constitute another clade. Chimonanthus nitens appears paraphyletic and is closely related to C. salicifolius and C. zhejiangensis, suggesting the need to reevaluate the species delimitation of C. nitens. Chimonanthus and Calycanthus diverged in the mid-Oligocene; the radiation of extant Chimonanthus species was dated to the mid-Miocene, while C. grammatus diverged from other Chimonanthus species in the late Miocene. C. salicifolius, C. nitens(a), and C. zhejiangensis are inferred to have diverged in the Pleistocene of the Quaternary period, suggesting recent speciation of a relict lineage in the subtropical forest regions in eastern China. 
This study provides important insights into the chloroplast genome features and evolutionary history of Chimonanthus and family Calycanthaceae.


2020 ◽  
Vol 12 (4) ◽  
pp. 3229-3246
Author(s):  
Magí Franquesa ◽  
Melanie K. Vanderhoof ◽  
Dimitris Stavrakoudis ◽  
Ioannis Z. Gitas ◽  
Ekhi Roteta ◽  
...  

Abstract. Over the past 2 decades, several global burned area products have been produced and released to the public. However, the accuracy assessment of such products largely depends on the availability of reliable reference data that currently do not exist on a global scale or whose production requires a high level of dedication of project resources. The important lack of reference data for the validation of burned area products is addressed in this paper. We provide the Burned Area Reference Database (BARD), the first publicly available database created by compiling existing reference BA (burned area) datasets from different international projects. BARD contains a total of 2661 reference files derived from Landsat and Sentinel-2 imagery. All those files have been checked for internal quality and are freely provided by the authors. To ensure database consistency, all files were transformed to a common format and were properly documented by following metadata standards. The goal of generating this database was to give BA algorithm developers and product testers reference information that would help them to develop or validate new BA products. BARD is freely available at https://doi.org/10.21950/BBQQU7 (Franquesa et al., 2020).


2021 ◽  
Author(s):  
Gert-Jan Jeunen ◽  
Tatsiana Lipinskaya ◽  
Helen Gajduchenko ◽  
Viktoriya Golovenchik ◽  
Michail Moroz ◽  
...  

Active environmental DNA (eDNA) surveillance through species-specific amplification has shown increased sensitivity in the detection of non-indigenous species (NIS) compared to traditional approaches. When many NIS are of interest, however, active surveillance decreases in cost- and time-efficiency. Passive surveillance through eDNA metabarcoding takes advantage of the complex DNA signal in environmental samples and facilitates the simultaneous detection of multiple species. While passive eDNA surveillance has previously detected NIS, comparative studies are essential to determine the ability of eDNA metabarcoding to accurately describe the range of invasion for multiple NIS versus alternative approaches. Here, we surveyed twelve sites, covering nine rivers across Belarus for NIS with three different techniques, i.e., an ichthyological, hydrobiological, and eDNA survey, whereby DNA was extracted from 500 mL surface water samples and amplified with two 16S rRNA primer assays targeting the fish and macro-invertebrate biodiversity. Nine non-indigenous fish and ten non-indigenous sediment-living macro-invertebrates were detected by traditional surveys, while seven NIS eDNA signals were picked up, including four fish, one aquatic and two sediment-living macro-invertebrates. Passive eDNA surveillance extended the range of invasion further north for two invasive fish and identified a new NIS for Belarus, the freshwater jellyfish Craspedacusta sowerbii. False-negative detections for the eDNA survey could be attributed to (i) preferential amplification of aquatic over sediment-living macro-invertebrates from surface water samples and (ii) an incomplete reference database. The evidence provided in this study recommends the implementation of both molecular-based and traditional approaches to maximize the probability of early detection of non-native organisms.


Molecules ◽  
2018 ◽  
Vol 23 (9) ◽  
pp. 2137 ◽  
Author(s):  
Xiang-Xiao Meng ◽  
Yan-Fang Xian ◽  
Li Xiang ◽  
Dong Zhang ◽  
Yu-Hua Shi ◽  
...  

The genus Sanguisorba, which contains about 30 species around the world and seven species in China, is the source of the medicinal plant Sanguisorba officinalis, which is commonly used as a hemostatic agent as well as to treat burns and scalds. Here we report the complete chloroplast (cp) genome sequences of four Sanguisorba species (S. officinalis, S. filiformis, S. stipulata, and S. tenuifolia var. alba). These four Sanguisorba cp genomes exhibit typical quadripartite and circular structures, and are 154,282 to 155,479 bp in length, consisting of large single-copy regions (LSC; 84,405–85,557 bp), small single-copy regions (SSC; 18,550–18,768 bp), and a pair of inverted repeats (IRs; 25,576–25,615 bp). The average GC content was ~37.24%. The four Sanguisorba cp genomes harbored 112 different genes arranged in the same order; these identical sections include 78 protein-coding genes, 30 tRNA genes, and four rRNA genes, if duplicated genes in IR regions are counted only once. A total of 39–53 long repeats and 79–91 simple sequence repeats (SSRs) were identified in the four Sanguisorba cp genomes, which provides opportunities for future studies of the population genetics of Sanguisorba medicinal plants. A phylogenetic analysis using the maximum parsimony (MP) method strongly supports a close relationship between S. officinalis and S. tenuifolia var. alba, followed by S. stipulata, and finally S. filiformis. The availability of these cp genomes provides valuable genetic information for future studies of Sanguisorba identification and provides insights into the evolution of the genus Sanguisorba.
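
The reported length ranges can be cross-checked against the quadripartite structure with a short sketch. It assumes the usual relation that total length equals LSC + SSC + 2 × IR; the function name `quadripartite_bounds` is hypothetical. Because the minimum (or maximum) of each region may come from a different species, the computed values are only outer bounds on the total, not exact genome lengths.

```python
# Region length ranges (bp) as reported in the abstract above.
lsc = (84_405, 85_557)   # large single-copy region
ssc = (18_550, 18_768)   # small single-copy region
ir = (25_576, 25_615)    # one inverted repeat; a genome carries two copies
total = (154_282, 155_479)  # reported range of full genome lengths

def quadripartite_bounds(lsc, ssc, ir):
    """Outer bounds on genome length, assuming length = LSC + SSC + 2 * IR."""
    low = lsc[0] + ssc[0] + 2 * ir[0]
    high = lsc[1] + ssc[1] + 2 * ir[1]
    return low, high

low, high = quadripartite_bounds(lsc, ssc, ir)
# The reported total range should fall inside the outer bounds.
consistent = low <= total[0] and total[1] <= high
print(low, high, consistent)
```

The reported total range falling inside these bounds indicates that the region lengths and genome lengths in the abstract are mutually consistent under that assumption.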

