Phytool, a ShinyApp to homogenise taxonomy of freshwater microalgae from DNA barcodes and microscopic observations

2021 ◽  
Vol 5 ◽  
Author(s):  
Alexis Canino ◽  
Agnès Bouchez ◽  
Christophe Laplace-Treyture ◽  
Isabelle Domaizon ◽  
Frédéric Rimet

Methods for biomonitoring of freshwater phytoplankton are evolving rapidly with eDNA-based methods, offering great complementarity with microscopy. Metabarcoding approaches have become more common in recent years, with a continuous increase in the amount of data generated. Depending on the researchers and the way they assign barcodes to species (bioinformatic pipelines and molecular reference databases), the taxonomic assignment obtained for HTS DNA reads might vary. This is also true for traditional taxonomic studies by microscopy, where classification and taxonomy are regularly adjusted. Because the resulting taxonomies are not homogeneous, gap analyses and comparisons between studies become even more challenging, and the curation needed to find potential consensus names is time-consuming. Here, we present a web-based application (Phytool), developed with ShinyApp (RStudio), that aims to make the harmonisation of taxonomy easier and more efficient, using a complete and up-to-date taxonomy reference database for freshwater microalgae. Phytool allows users to homogenise and update freshwater phytoplankton taxonomic names from sequence files and data tables uploaded directly to the application. It also gathers barcodes from curated references in a user-friendly way that makes it possible to search for specific organisms. All the data provided are downloadable, with the possibility to apply filters in order to select only the required taxa and fields (e.g. specific taxonomic ranks). The main goal is to make the connection between microscopy, molecular biology and taxonomy accessible to a broad range of users through different ready-to-use functions. This study estimates that only 25% of species of freshwater phytoplankton in Phytobs are associated with a barcode. We plead for an increased effort to enrich reference databases by coupling taxonomy and molecular methods. Phytool should make this crucial work more efficient.
The application is available at https://caninuzzo.shinyapps.io/phytool_v1/
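
The name-harmonisation step described in the abstract can be pictured as a lookup from observed names to accepted names. The following is a minimal sketch under stated assumptions, not Phytool's actual implementation: the `SYNONYMS` table, the taxon names in it, and the functions `normalise` and `harmonise` are all invented for illustration.

```python
# Hypothetical synonym table: observed name -> accepted name.
# The taxon names are invented placeholders, not entries from Phytool's database.
SYNONYMS = {
    "Examplea antiqua": "Examplea nova",  # outdated synonym
    "Examplea nova": "Examplea nova",     # already the accepted name
}


def normalise(name: str) -> str:
    """Collapse repeated whitespace and lowercase a name so lookups ignore formatting."""
    return " ".join(name.split()).lower()


def harmonise(names, synonyms):
    """Map each observed name to its accepted name; None marks unresolved names."""
    lookup = {normalise(key): value for key, value in synonyms.items()}
    return {name: lookup.get(normalise(name)) for name in names}


observed = ["  Examplea  antiqua", "examplea nova", "Unknownia sp."]
harmonised = harmonise(observed, SYNONYMS)
```

Names that come back as `None` are the ones a curator would still need to resolve by hand, which is the time-consuming step the abstract refers to.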

2021 ◽  
Vol 4 ◽  
Author(s):  
François Keck ◽  
Florian Altermatt

Reference databases of sequences that have been taxonomically assigned are a key element for DNA-based identification of organisms. Accurate and complete reference databases are necessary to associate a correct taxonomic name to the sequences obtained in studies using metabarcoding. Today many research projects using DNA metabarcoding include the development of a custom reference database, often derived from large repositories like GenBank. At the same time, many projects are focussing on the development of ready-to-use databases validated by experts and targeting specific markers and taxonomic groups. While mainstream tools such as spreadsheet software may be suitable to manage small databases, they quickly become insufficient when the amount of data increases and validation operations become more complex. There is a clear need for user-friendly and powerful tools to manipulate biological sequences and manage reference databases. The R language, which is free software and has already been adopted by many researchers to perform their analyses, is highly suitable for developing such tools. In this talk, we will outline the approach we recommend to handle small- to middle-sized reference databases, which still make up the majority of projects. We will advocate that a simple tabular approach where each sequence constitutes an observation may be the most adequate. While such a single table may be less flexible and less optimized than relational databases or more complex data structures, it is easy to maintain and allows the direct use of modern dataframe-centric tools. We will specifically present and discuss two R packages that can be used jointly to make reference database development more accessible and more reproducible. First, we will briefly introduce bioseq (Keck 2020), which is dedicated to biological sequence manipulation and analysis. 
The package implements classes and functions to make analyses of complex datasets including DNA, RNA or protein sequences as simple as possible. The strength of bioseq is to provide standard and more advanced functions to perform low level operations through a simple and consistent programming interface. Then we will present refdb, which has been developed as an environment for semi-automatic and assisted construction of reference databases. The refdb package is a reference database manager offering a set of powerful functions to import, organize, clean, filter, audit and export the data. We will outline how these two packages together can speed up reference database generation and handling, and contribute to standardization and repeatability in metabarcoding studies.
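
To illustrate the tabular approach where each sequence constitutes an observation, here is a minimal Python sketch. It does not use or reproduce the bioseq or refdb APIs (both are R packages); the column names, the invented records, and the `clean` and `to_csv` helpers are assumptions made for this example only.

```python
import csv
import io

# One row per sequence. All records are invented for illustration.
FIELDS = ["id", "genus", "species", "marker", "sequence"]
RECORDS = [
    {"id": "seq1", "genus": "Examplea", "species": "Examplea nova",
     "marker": "rbcL", "sequence": "atgcatgcat"},
    {"id": "seq2", "genus": "Examplea", "species": "Examplea nova",
     "marker": "rbcL", "sequence": "ATGNNNGCAT"},  # contains ambiguous bases
    {"id": "seq3", "genus": "Samplia", "species": "",
     "marker": "18S", "sequence": "ATGCATGCATGC"},  # no species-level name
]


def clean(records, min_length=8, max_ambiguous=0):
    """Keep rows with a species name, enough bases, and few non-ACGT characters."""
    kept = []
    for row in records:
        seq = row["sequence"].upper()
        ambiguous = sum(1 for base in seq if base not in "ACGT")
        if row["species"] and len(seq) >= min_length and ambiguous <= max_ambiguous:
            kept.append({**row, "sequence": seq})
    return kept


def to_csv(records):
    """Serialise the table to CSV text so it can be shared or versioned."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


cleaned = clean(RECORDS)
exported = to_csv(cleaned)
```

Because every filter is a plain pass over rows, each cleaning step stays easy to audit and to repeat, which is the maintainability trade-off the abstract describes for a single table versus a relational schema.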


2021 ◽  
Vol 4 ◽  
Author(s):  
Cristina Claver ◽  
Oriol Canals ◽  
Naiara Rodriguez-Ezpeleta

Environmental DNA (eDNA) metabarcoding, the process of sequencing DNA collected from the environment for producing biodiversity inventories, is increasingly being applied to assess fish diversity and distribution in marine environments. Yet, the successful application of this technique relies heavily on accurate and complete reference databases used for taxonomic assignment. The most used markers for fish eDNA metabarcoding studies are the cytochrome C oxidase subunit 1 (COI), 16S ribosomal RNA (16S), 12S ribosomal RNA (12S) and cytochrome b (cyt b) genes, whose sequences are usually retrieved from GenBank, the largest DNA sequence database that represents a worldwide public resource for genetic studies. Thus, the completeness and accuracy of GenBank is critical to derive reliable estimations from fish eDNA metabarcoding data. Here, we have i) compiled the checklist of European marine fishes, ii) performed a gap analysis of the four genes and, within COI and 12S, also of the most used barcodes for fish, and iii) developed a workflow to detect potentially incorrect records in GenBank. We found that of the 1965 species in the checklist (1761 Actinopterygii, 189 Elasmobranchii, 9 Holocephali, 4 Petromyzonti and 2 Myxini), about 70% have sequences for COI, whereas fewer have sequences for 12S, 16S and cyt b (45-55%). Among the species for which COI and 12S sequences are available, about 60% and 40%, respectively, have sequences covering the most used barcodes. The analysis of pairwise distances between sequences revealed pairs belonging to the same species with significantly low similarity and pairs belonging to different high-level taxonomic groups (class, order) with significantly high similarity. In light of this further confirmation of the presence of a substantial number of incorrect records in GenBank, we propose a method for identifying and removing spurious sequences to create reliable and accurate reference databases for eDNA metabarcoding.
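
A gap analysis of this kind amounts to asking, for each marker, what share of the checklist species has at least one reference sequence. The sketch below shows that calculation on invented data. The species labels, the record list, and the `marker_coverage` function are assumptions for illustration and do not reproduce the authors' workflow or their figures.

```python
def marker_coverage(checklist, records):
    """Return, per marker, the percentage of checklist species with >= 1 sequence.

    `records` is an iterable of (species, marker) pairs, one per reference sequence.
    Species absent from the checklist are ignored.
    """
    species_set = set(checklist)
    covered = {}
    for species, marker in records:
        if species in species_set:
            covered.setdefault(marker, set()).add(species)
    return {marker: 100.0 * len(found) / len(species_set)
            for marker, found in covered.items()}


# Invented checklist and reference records.
checklist = ["sp_a", "sp_b", "sp_c", "sp_d"]
records = [
    ("sp_a", "COI"), ("sp_b", "COI"), ("sp_c", "COI"),
    ("sp_a", "12S"), ("sp_a", "12S"),  # duplicates count once per species
    ("sp_x", "COI"),                   # not on the checklist, ignored
]
coverage = marker_coverage(checklist, records)
```

Counting species rather than sequences matters here: a marker with many sequences from a few well-studied species can still leave most of the checklist uncovered.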


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11865
Author(s):  
Dylan Catlett ◽  
Kevin Son ◽  
Connie Liang

Background: High-throughput sequencing of phylogenetically informative marker genes is a widely used method to assess the diversity and composition of microbial communities. Taxonomic assignment of sampled marker gene sequences (referred to as amplicon sequence variants, or ASVs) imparts ecological significance to these genetic data. To assign taxonomy to an ASV, a taxonomic assignment algorithm compares the ASV to a collection of reference sequences (a reference database) with known taxonomic affiliations. However, many taxonomic assignment algorithms and reference databases are available, and the optimal algorithm and database for a particular scientific question is often unclear. Here, we present the ensembleTax R package, which provides an efficient framework for integrating taxonomic assignments predicted with any number of taxonomic assignment algorithms and reference databases to determine ensemble taxonomic assignments for ASVs. Methods: The ensembleTax R package relies on two core algorithms: taxmapper and assign.ensembleTax. The taxmapper algorithm maps taxonomic assignments derived from one reference database onto the taxonomic nomenclature (a set of taxonomic naming and ranking conventions) of another reference database. The assign.ensembleTax algorithm computes ensemble taxonomic assignments for each ASV in a data set based on any number of taxonomic assignments determined with independent methods. Various parameters allow analysts to prioritize obtaining either more ASVs with more predicted clade names or more robust clade name predictions supported by multiple independent methods in ensemble taxonomic assignments. Results: The ensembleTax R package is used to compute two sets of ensemble taxonomic assignments for a collection of protistan ASVs sampled from the coastal ocean. 
Comparisons of taxonomic assignments predicted by individual methods with those predicted by ensemble methods show that conservative implementations of the ensembleTax package minimize disagreements between taxonomic assignments predicted by individual and ensemble methods, but result in ASVs with fewer ranks assigned taxonomy. Less conservative implementations of the ensembleTax package result in an increased fraction of ASVs classified at all taxonomic ranks, but increase the number of ASVs for which ensemble assignments disagree with those predicted by individual methods. Discussion: We discuss how implementation of the ensembleTax R package may be optimized to address specific scientific objectives based on the results of the application of the ensembleTax package to marine protist communities. While further work is required to evaluate the accuracy of ensemble taxonomic assignments relative to taxonomic assignments predicted by individual methods, we also discuss scenarios where ensemble methods are expected to improve the accuracy of taxonomy prediction for ASVs.
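
The trade-off between conservative and less conservative ensemble assignments can be illustrated with a simple rank-by-rank vote. This sketch is not the assign.ensembleTax algorithm and does not reflect the package's parameters: the `ensemble_assignment` function, its `min_agree` threshold, and the placeholder clade names are assumptions chosen only to show the general idea.

```python
from collections import Counter


def ensemble_assignment(assignments, min_agree=2):
    """Combine per-method lineages (lists of clade names, None = unassigned).

    Walks down the ranks and keeps the most common name at each rank while at
    least `min_agree` methods support it; stops at the first rank that fails.
    """
    consensus = []
    candidates = list(assignments)
    max_depth = max((len(lineage) for lineage in assignments), default=0)
    for depth in range(max_depth):
        names = [lineage[depth] for lineage in candidates
                 if len(lineage) > depth and lineage[depth] is not None]
        if not names:
            break
        name, votes = Counter(names).most_common(1)[0]
        if votes < min_agree:
            break
        consensus.append(name)
        # Only methods consistent with the consensus so far vote on deeper ranks.
        candidates = [lineage for lineage in candidates
                      if len(lineage) > depth and lineage[depth] == name]
    return consensus


# Three hypothetical methods assigning one ASV; clade names are placeholders.
methods = [
    ["Clade_A", "Clade_B", "Clade_C1"],
    ["Clade_A", "Clade_B", "Clade_C2"],
    ["Clade_A", "Clade_B", None],
]
conservative = ensemble_assignment(methods, min_agree=2)
permissive = ensemble_assignment(methods, min_agree=1)
```

With `min_agree=2` the assignment stops after the second rank, where the methods still agree. With `min_agree=1` a third rank is assigned, at the cost of disagreeing with at least one individual method, which mirrors the behaviour the Results describe.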


2017 ◽  
Vol 1 (1) ◽  
pp. 44-49
Author(s):  
Nur Azizah ◽  
Dedeh Supriyanti ◽  
Siti Fairuz Aminah Mustapha ◽  
Holly Yang

In a company, the handling of income and expenses must be grounded in the goal of generating profit. The success of financial management within the company can be judged by how well the finance staff manage funds and make the most of available opportunities, with the aim of controlling the company's cash flow and generating profits in line with expectations. With web-based online accounting system version 2.0, companies can more easily manage money flowing in and out of the company's cash. It is a user-friendly system with navigation that makes it easy for finance staff to use. Its features range from the creation of cash accounts and corporate bank accounts in the system, through the deletion or archiving of cash accounts, to the creation of transfer invoices and the receiving and sending of money. Thus, this system is very effective and efficient in the management of corporate income and cash disbursements. Keywords: Accounting Online System, Financial Management, Cash and Bank


2018 ◽  
Vol 3 (1) ◽  
Author(s):  
Mehmet EMIN KORTAK

This research aimed at designing and improving a web-based integrated peer and self-assessment system. WesPASS (web-based peer-assessment system), developed in this research, allows students to assess their own or their peers' performance and project assignments and to report the results of these assessments so that the assignments can be corrected. This study employed design-based research. The participants included 102 fourth-grade primary school students and their 4 teachers from 2 state and 2 private primary schools in Ankara, Kecioren (Turkey), who used the system and completed a questionnaire survey to assess its quality. The findings were analyzed through quantitative data analysis. The findings revealed that the system can be used by elementary school students for peer and self-assessment. The participants stated that WesPASS is simple and user-friendly, and it accelerates the assessment process by employing information technology and allows to share opinions 


2019 ◽  
Vol 14 (7) ◽  
pp. 621-627 ◽  
Author(s):  
Youhuang Bai ◽  
Xiaozhuan Dai ◽  
Tiantian Ye ◽  
Peijing Zhang ◽  
Xu Yan ◽  
...  

Background: Long noncoding RNAs (lncRNAs) are endogenous noncoding RNAs, arbitrarily longer than 200 nucleotides, that play critical roles in diverse biological processes. LncRNAs exist in different genomes ranging from animals to plants. Objective: PlncRNADB is a searchable database of lncRNA sequences and annotation in plants. Methods: We built a pipeline for lncRNA prediction in plants, providing a convenient utility for users to quickly distinguish potential noncoding RNAs from protein-coding transcripts. Results: More than five thousand lncRNAs are collected from four plant species (Arabidopsis thaliana, Arabidopsis lyrata, Populus trichocarpa and Zea mays) in PlncRNADB. Moreover, our database provides the relationship between lncRNAs and various RNA-binding proteins (RBPs), which can be displayed through a user-friendly web interface. Conclusion: PlncRNADB can serve as a reference database to investigate the lncRNAs and their interaction with RNA-binding proteins in plants. The PlncRNADB is freely available at http://bis.zju.edu.cn/PlncRNADB/.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1736
Author(s):  
Zengchong Yang ◽  
Xiucheng Liu ◽  
Bin Wu ◽  
Ren Liu

Previous studies on the Lamb wave touchscreen (LWT) were carried out on the assumption that the unknown touch had parameters consistent with the acoustic fingerprints in the reference database. The adaptability of LWT to variations in touch force and touch area was investigated in this study for the first time. The automatic collection of the databases of acoustic fingerprints was realized with an experimental prototype of LWT employing three pairs of transmitter–receivers. A self-adaptive updated weight coefficient for the transmitter–receiver pairs in use was employed to successfully improve the accuracy of the localization model established based on a learning method. The performance of the improved method in locating single- and two-touch actions with reference databases of different parameters was carefully evaluated. The robustness of the LWT to the variation of the touch force varied with the touch area. Moreover, it was feasible to locate touch actions of large area with reference databases of small touch areas as long as the unknown touch and the reference databases met the condition of equivalent averaged stress.


GigaScience ◽  
2021 ◽  
Vol 10 (2) ◽  
Author(s):  
Guilhem Sempéré ◽  
Adrien Pétel ◽  
Magsen Abbé ◽  
Pierre Lefeuvre ◽  
Philippe Roumagnac ◽  
...  

Background: Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both shotgun- and metabarcoding-based metagenomics, while online reference databases and user-friendly tools exist for running various types of analyses (e.g., Qiime, Mothur, Megan, IMG/VR, Anvi'o, Qiita, MetaVir), scientists lack comprehensive software for easily building scalable, searchable, online data repositories on which they can rely during their ongoing research. Results: metaXplor is a scalable, distributable, fully web-interfaced application for managing, sharing, and exploring metagenomic data. Being based on a flexible NoSQL data model, it has few constraints regarding dataset contents and thus proves useful for handling outputs from both shotgun and metabarcoding techniques. By supporting incremental data feeding and providing means to combine filters on all imported fields, it allows for exhaustive content browsing, as well as rapid narrowing to find specific records. The application also features various interactive data visualization tools, ways to query contents by BLASTing external sequences, and an integrated pipeline to enrich assignments with phylogenetic placements. The project home page provides the URL of a live instance allowing users to test the system on public data. Conclusion: metaXplor allows efficient management and exploration of metagenomic data. Its availability as a set of Docker containers, making it easy to deploy on academic servers, on the cloud, or even on personal computers, will facilitate its adoption.


Metabolites ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 113
Author(s):  
Julia Koblitz ◽  
Sabine Will ◽  
S. Riemer ◽  
Thomas Ulas ◽  
Meina Neumann-Schaal ◽  
...  

Genome-scale metabolic models are of high interest in a number of different research fields. Flux balance analysis (FBA) and other mathematical methods allow the prediction of the steady-state behavior of metabolic networks under different environmental conditions. However, many existing applications for flux optimization do not provide a metabolite-centric view of fluxes. Metano is a standalone, open-source toolbox for the analysis and refinement of metabolic models. While flux distributions in metabolic networks are predominantly analyzed from a reaction-centric point of view, the Metano methods of split-ratio analysis and metabolite flux minimization also allow a metabolite-centric view of flux distributions. In addition, we present MMTB (Metano Modeling Toolbox), a web-based toolbox for metabolic modeling including a user-friendly interface to Metano methods. MMTB assists during bottom-up construction of metabolic models by integrating reaction and enzymatic annotation data from different databases. Furthermore, MMTB is especially designed for inexperienced users by providing an intuitive interface to the most commonly used modeling methods and offering novel visualizations. Additionally, MMTB allows users to upload their models, which can in turn be explored and analyzed by the community. We introduce MMTB through two use cases, involving a published model of Corynebacterium glutamicum and a newly created model of Phaeobacter inhibens.


2021 ◽  
pp. 193229682098557
Author(s):  
Alysha M. De Livera ◽  
Jonathan E. Shaw ◽  
Neale Cohen ◽  
Anne Reutens ◽  
Agus Salim

Motivation: Continuous glucose monitoring (CGM) systems are an essential part of novel technology in diabetes management and care. CGM studies have become increasingly popular among researchers, healthcare professionals, and people with diabetes due to the large amount of useful information that can be collected using CGM systems. The analysis of the data from these studies for research purposes, however, remains a challenge due to the characteristics and large volume of the data. Results: Currently, there are no publicly available interactive software applications that can perform statistical analyses and visualization of data from CGM studies. With the rapidly increasing popularity of CGM studies, such an application is becoming necessary for anyone who works with these large CGM datasets, in particular for those with little background in programming or statistics. CGMStatsAnalyser is a publicly available, user-friendly, web-based application, which can be used to interactively visualize, summarize, and statistically analyze voluminous and complex CGM datasets together with the subject characteristics with ease.

