Phytool, a ShinyApp to homogenise taxonomy of freshwater microalgae from DNA barcodes and microscopic observations

Methods for biomonitoring freshwater phytoplankton are evolving rapidly, with eDNA-based methods offering strong complementarity with microscopy. Metabarcoding approaches have become more common in recent years, and the amount of data generated keeps increasing. The taxonomic assignment obtained for HTS DNA reads may vary depending on the researchers and on how they assigned barcodes to species (bioinformatic pipelines and molecular reference databases). The same holds for traditional taxonomic studies by microscopy, where classification and taxonomy are regularly adjusted. Because the resulting taxonomies are not homogeneous, gap analyses and comparisons between studies become even more challenging, and the curation needed to find potential consensus names is time-consuming. Here, we present a web-based application (Phytool), developed as a Shiny app (RStudio), that aims to make the harmonisation of taxonomy easier and more efficient, using a complete and up-to-date taxonomy reference database for freshwater microalgae. Phytool allows users to homogenise and update freshwater phytoplankton taxonomic names from sequence files and data tables uploaded directly to the application. It also gathers barcodes from curated references in a user-friendly way that makes it possible to search for specific organisms. All the data provided are downloadable, with filters available to select only the required taxa and fields (e.g. specific taxonomic ranks). The main goal is to make the connection between microscopy, molecular biology and taxonomy accessible to a broad range of users through different ready-to-use functions. This study estimates that only 25% of the freshwater phytoplankton species in Phytobs are associated with a barcode. We plead for an increased effort to enrich reference databases by coupling taxonomy and molecular methods. Phytool should make this crucial work more efficient.
The application is available at https://caninuzzo.shinyapps.io/phytool_v1/
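
The harmonisation step described in this abstract can be illustrated with a minimal Python sketch. It is not Phytool's implementation: the function names, the accepted-name list and the synonym table are all hypothetical placeholders, and it assumes that harmonisation reduces to a lookup against a reference list of accepted names plus known synonyms.

```python
def normalise(name):
    """Collapse whitespace and case so lookups ignore trivial differences."""
    return " ".join(name.split()).lower()


def harmonise(names, accepted, synonyms):
    """Map each input name to an accepted name, or None when no match exists.

    accepted: iterable of accepted names (hypothetical reference list).
    synonyms: dict mapping outdated or variant names to accepted names.
    """
    accepted_lookup = {normalise(n): n for n in accepted}
    synonym_lookup = {normalise(k): v for k, v in synonyms.items()}
    result = {}
    for raw in names:
        key = normalise(raw)
        if key in accepted_lookup:
            result[raw] = accepted_lookup[key]   # already an accepted name
        elif key in synonym_lookup:
            result[raw] = synonym_lookup[key]    # known synonym, updated
        else:
            result[raw] = None                   # left for manual curation
    return result


# Placeholder names for illustration only; they are not real taxa.
ACCEPTED = ["Alphagenus prima", "Betagenus secunda"]
SYNONYMS = {"Oldgenus prima": "Alphagenus prima"}

harmonised = harmonise(
    ["alphagenus  PRIMA", "Oldgenus prima", "Unknown taxon"], ACCEPTED, SYNONYMS
)
```

Names that resolve to `None` are the ones that would still need expert curation; a real tool would likely also handle authorities, ranks and fuzzy matching, which this sketch leaves out.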

From DNA sequences to operational reference databases: an opinionated approach using R

ARPHA Conference Abstracts ◽

10.3897/aca.4.e64936 ◽

2021 ◽

Vol 4 ◽

Author(s):

François Keck ◽

Florian Altermatt

Keyword(s):

Dna Sequences ◽

Relational Databases ◽

Reference Database ◽

Complex Data ◽

Biological Sequence ◽

R Language ◽

Speed Up ◽

Reference Databases ◽

Taxonomic Groups ◽

User Friendly

Reference databases of sequences that have been taxonomically assigned are a key element for DNA-based identification of organisms. Accurate and complete reference databases are necessary to associate a correct taxonomic name with the sequences obtained in metabarcoding studies. Today, many research projects using DNA metabarcoding include the development of a custom reference database, often derived from large repositories like GenBank. At the same time, many projects are focussing on the development of ready-to-use databases validated by experts and targeting specific markers and taxonomic groups. While mainstream tools such as spreadsheet software may be suitable for managing small databases, they quickly become insufficient when the amount of data increases and validation operations become more complex. There is a clear need for user-friendly and powerful tools to manipulate biological sequences and manage reference databases. The R language, which is free software and has already been adopted by many researchers for their analyses, is highly suitable for developing such tools. In this talk, we will outline the approach we recommend for handling small- to middle-sized reference databases, which still make up the majority of projects. We will argue that a simple tabular approach, where each sequence constitutes an observation, may be the most adequate. While such a single table may be less flexible and less optimized than relational databases or more complex data structures, it is easy to maintain and allows the direct use of modern dataframe-centric tools. We will specifically present and discuss two R packages that can be used jointly to make reference database development more accessible and more reproducible. First, we will briefly introduce bioseq (Keck 2020), which is dedicated to biological sequence manipulation and analysis. The package implements classes and functions to make analyses of complex datasets including DNA, RNA or protein sequences as simple as possible. The strength of bioseq is that it provides standard and more advanced functions for low-level operations through a simple and consistent programming interface. Then we will present refdb, which has been developed as an environment for semi-automatic and assisted construction of reference databases. The refdb package is a reference database manager offering a set of powerful functions to import, organize, clean, filter, audit and export the data. We will outline how these two packages together can speed up reference database generation and handling, and contribute to standardization and repeatability in metabarcoding studies.
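
The "one row per sequence" layout advocated in this abstract can be sketched in a few lines. The example below is written in Python rather than R and does not use or imitate the bioseq or refdb APIs; the column names (`id`, `sequence`, `family`, `genus`, `species`) and the `audit` function are assumptions made for illustration.

```python
from collections import Counter

RANKS = ("family", "genus", "species")
# IUPAC nucleotide codes plus the gap character.
DNA_ALPHABET = set("ACGTRYSWKMBDHVN-")


def audit(records):
    """Return (record id, problem) pairs for a flat table of sequences.

    Each record is a dict with the hypothetical columns
    'id', 'sequence', 'family', 'genus' and 'species'.
    """
    problems = []
    seq_counts = Counter(r["sequence"].upper() for r in records)
    for r in records:
        seq = r["sequence"].upper()
        if not seq:
            problems.append((r["id"], "empty sequence"))
        elif set(seq) - DNA_ALPHABET:
            problems.append((r["id"], "invalid character"))
        if seq and seq_counts[seq] > 1:
            problems.append((r["id"], "duplicated sequence"))
        for rank in RANKS:
            if not r.get(rank):
                problems.append((r["id"], f"missing {rank}"))
    return problems


# Illustrative rows; identifiers and taxon names are placeholders.
table = [
    {"id": "s1", "sequence": "ACGT", "family": "F1", "genus": "G1", "species": "G1 a"},
    {"id": "s2", "sequence": "acgt", "family": "F1", "genus": "G1", "species": "G1 a"},
    {"id": "s3", "sequence": "ACXT", "family": "F2", "genus": "", "species": ""},
]
report = audit(table)
```

Because every check is a pass over rows of a single table, the same logic carries over directly to dataframe tools, which is the maintainability argument the abstract makes.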

Assessing accuracy and completeness of GenBank for eDNA metabarcoding: towards a reliable marine fish reference database

ARPHA Conference Abstracts ◽

10.3897/aca.4.e64671 ◽

2021 ◽

Vol 4 ◽

Author(s):

Cristina Claver ◽

Oriol Canals ◽

Naiara Rodriguez-Ezpeleta

Keyword(s):

Ribosomal Rna ◽

Gap Analysis ◽

Environmental Dna ◽

Reference Database ◽

Fish Diversity ◽

Cyt B ◽

Taxonomic Assignment ◽

Reference Databases ◽

Taxonomic Groups ◽

High Level

Environmental DNA (eDNA) metabarcoding, the process of sequencing DNA collected from the environment to produce biodiversity inventories, is increasingly being applied to assess fish diversity and distribution in marine environments. Yet the successful application of this technique relies heavily on accurate and complete reference databases for taxonomic assignment. The most used markers for fish eDNA metabarcoding studies are the cytochrome c oxidase subunit 1 (COI), 16S ribosomal RNA (16S), 12S ribosomal RNA (12S) and cytochrome b (cyt b) genes, whose sequences are usually retrieved from GenBank, the largest DNA sequence database and a worldwide public resource for genetic studies. Thus, the completeness and accuracy of GenBank are critical for deriving reliable estimates from fish eDNA metabarcoding data. Here, we have i) compiled the checklist of European marine fishes, ii) performed a gap analysis of the four genes and, within COI and 12S, also of the most used barcodes for fish, and iii) developed a workflow to detect potentially incorrect records in GenBank. We found that, of the 1965 species in the checklist (1761 Actinopterygii, 189 Elasmobranchii, 9 Holocephali, 4 Petromyzonti and 2 Myxini), about 70% have sequences for COI, whereas fewer have sequences for 12S, 16S and cyt b (45-55%). Among the species for which COI and 12S sequences are available, about 60% and 40%, respectively, have sequences covering the most used barcodes. The analysis of pairwise distances between sequences revealed pairs belonging to the same species with significantly low similarity and pairs belonging to different high-level taxonomic groups (class, order) with significantly high similarity. In light of this further confirmation that GenBank contains a substantial number of incorrect records, we propose a method for identifying and removing spurious sequences to create reliable and accurate reference databases for eDNA metabarcoding.
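
The gap analysis in step ii) amounts to asking, for each marker, what fraction of checklist species has at least one sequence. The Python sketch below shows that calculation under simple assumptions: the function name and the `(species, marker)` record layout are hypothetical, the species labels are placeholders, and it does not reproduce the authors' workflow or query GenBank.

```python
def marker_coverage(checklist, records):
    """Fraction of checklist species with at least one sequence, per marker.

    checklist: iterable of species names.
    records: iterable of (species, marker) pairs, one per sequence record.
    """
    species = set(checklist)
    if not species:
        raise ValueError("checklist is empty")
    covered = {}
    for sp, marker in records:
        if sp in species:  # ignore records for species outside the checklist
            covered.setdefault(marker, set()).add(sp)
    return {marker: len(found) / len(species) for marker, found in covered.items()}


# Placeholder data for illustration only.
checklist = ["sp_a", "sp_b", "sp_c", "sp_d"]
records = [
    ("sp_a", "COI"),
    ("sp_a", "COI"),   # several records for one species count once
    ("sp_b", "COI"),
    ("sp_a", "12S"),
    ("sp_z", "COI"),   # not on the checklist, so not counted
]
coverage = marker_coverage(checklist, records)
```

Counting species in a set rather than counting records keeps heavily sequenced species from inflating the coverage figure.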

ensembleTax: an R package for determinations of ensemble taxonomic assignments of phylogenetically-informative marker gene sequences

PeerJ ◽

10.7717/peerj.11865 ◽

2021 ◽

Vol 9 ◽

pp. e11865

Author(s):

Dylan Catlett ◽

Kevin Son ◽

Connie Liang

Keyword(s):

Marker Gene ◽

Ensemble Methods ◽

R Package ◽

Reference Database ◽

Gene Sequences ◽

Taxonomic Assignment ◽

Informative Marker ◽

Data Set ◽

Reference Databases ◽

Taxonomic Assignments

Background: High-throughput sequencing of phylogenetically informative marker genes is a widely used method to assess the diversity and composition of microbial communities. Taxonomic assignment of sampled marker gene sequences (referred to as amplicon sequence variants, or ASVs) imparts ecological significance to these genetic data. To assign taxonomy to an ASV, a taxonomic assignment algorithm compares the ASV to a collection of reference sequences (a reference database) with known taxonomic affiliations. However, many taxonomic assignment algorithms and reference databases are available, and the optimal algorithm and database for a particular scientific question is often unclear. Here, we present the ensembleTax R package, which provides an efficient framework for integrating taxonomic assignments predicted with any number of taxonomic assignment algorithms and reference databases to determine ensemble taxonomic assignments for ASVs. Methods: The ensembleTax R package relies on two core algorithms: taxmapper and assign.ensembleTax. The taxmapper algorithm maps taxonomic assignments derived from one reference database onto the taxonomic nomenclature (a set of taxonomic naming and ranking conventions) of another reference database. The assign.ensembleTax algorithm computes ensemble taxonomic assignments for each ASV in a data set based on any number of taxonomic assignments determined with independent methods. Various parameters allow analysts to prioritize obtaining either more ASVs with more predicted clade names or more robust clade name predictions supported by multiple independent methods in ensemble taxonomic assignments. Results: The ensembleTax R package is used to compute two sets of ensemble taxonomic assignments for a collection of protistan ASVs sampled from the coastal ocean. Comparisons of taxonomic assignments predicted by individual methods with those predicted by ensemble methods show that conservative implementations of the ensembleTax package minimize disagreements between taxonomic assignments predicted by individual and ensemble methods, but result in ASVs with fewer ranks assigned taxonomy. Less conservative implementations of the ensembleTax package result in an increased fraction of ASVs classified at all taxonomic ranks, but increase the number of ASVs for which ensemble assignments disagree with those predicted by individual methods. Discussion: We discuss how implementation of the ensembleTax R package may be optimized to address specific scientific objectives based on the results of the application of the ensembleTax package to marine protist communities. While further work is required to evaluate the accuracy of ensemble taxonomic assignments relative to taxonomic assignments predicted by individual methods, we also discuss scenarios where ensemble methods are expected to improve the accuracy of taxonomy prediction for ASVs.
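
The trade-off between conservative and less conservative ensembles can be made concrete with a small voting sketch. This is written in Python, not R, and it is not the assign.ensembleTax algorithm: the `ensemble_assignment` function, its `min_support` parameter and the stop-at-first-disagreement rule are assumptions chosen only to illustrate the idea, and the clade names are placeholders.

```python
from collections import Counter


def ensemble_assignment(assignments, min_support=2):
    """Rank-by-rank consensus of several taxonomic assignments for one ASV.

    assignments: list of equal-length tuples ordered from highest to lowest
    rank, with None where a method left a rank unassigned.
    A name is accepted at a rank only if it is the unique most frequent
    name and at least min_support methods propose it; otherwise that rank
    and all lower ranks stay None.
    """
    n_ranks = len(assignments[0])
    consensus = [None] * n_ranks
    active = list(assignments)  # methods still consistent with the consensus
    for i in range(n_ranks):
        votes = Counter(a[i] for a in active if a[i] is not None)
        if not votes:
            break
        ranked = votes.most_common(2)
        name, count = ranked[0]
        tie = len(ranked) > 1 and ranked[1][1] == count
        if count < min_support or tie:
            break
        consensus[i] = name
        active = [a for a in active if a[i] == name]
    return tuple(consensus)


# Three hypothetical methods assigning one ASV at three ranks.
methods = [("K1", "P1", "C1"), ("K1", "P1", "C1"), ("K1", "P2", "C9")]
lenient = ensemble_assignment(methods, min_support=2)
strict = ensemble_assignment(methods, min_support=3)
```

With `min_support=2` the ASV is named at every rank even though one method disagrees, whereas `min_support=3` names only the rank on which all methods agree, mirroring the pattern reported in the Results.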

The Role of Web Based Accounting Online System 2.0 as the Company's Income and Expense Management

Aptisi Transactions on Management (ATM) ◽

10.33050/atm.v1i1.655 ◽

2017 ◽

Vol 1 (1) ◽

pp. 44-49

Author(s):

Nur Azizah ◽

Dedeh Supriyanti ◽

Siti Fairuz Aminah Mustapha ◽

Holly Yang

Keyword(s):

Financial Management ◽

Accounting System ◽

Bank Account ◽

Web Based ◽

Online System ◽

Version 2.0 ◽

User Friendly ◽

The Impact ◽

A Company

In a company, the handling of income and expenses should be grounded in the goal of generating profit. The success of financial management within a company can be gauged by the finance team's ability to manage funds and make the most of available opportunities, with the aim of controlling the company's cash flow and generating profits in line with expectations. A web-based online accounting system, version 2.0, makes it easier for companies to manage money flowing into and out of the company's cash. The system is user-friendly, with navigation that makes it easy for financial management staff to use. Its features range from creating the company's cash accounts and corporate bank accounts in the system, through deleting or filing cash accounts, to creating transfer invoices and receiving and sending money. Thus, the system is very effective and efficient for managing corporate income and cash disbursements. Keywords: Accounting Online System, Financial Management, Cash and Bank

Web-Based Peer and Self-Assessment System Design and Development for Elementary School Students

Journal of Education in Black Sea Region ◽

10.31578/jebs.v3i1.121 ◽

2018 ◽

Vol 3 (1) ◽

Author(s):

Mehmet EMIN KORTAK

Keyword(s):

Elementary School ◽

Elementary School Students ◽

Primary Schools ◽

Peer Assessment ◽

Assessment System ◽

Assessment Process ◽

School Students ◽

Self Assessment ◽

Web Based ◽

User Friendly

This research aimed at designing and improving a web-based integrated peer and self-assessment system. WesPASS (web-based peer-assessment system), developed in this research, allows students to assess their own or their peers’ performance and project assignments and to report the results of these assessments so that they can correct their assignments. This study employed design-based research. The participants included 102 fourth-grade primary school students and their 4 teachers from 2 state and 2 private primary schools in Ankara, Kecioren (Turkey), who used the system and completed a questionnaire survey to assess its quality. The findings were analyzed through quantitative data analysis. The findings revealed that the system can be used by elementary school students for peer and self-assessment. The participants stated that WesPASS is simple and user-friendly, that it accelerates the assessment process by employing information technology, and that it allows opinions to be shared.

PlncRNADB: A Repository of Plant lncRNAs and lncRNA-RBP Protein Interactions

Current Bioinformatics ◽

10.2174/1574893614666190131161002 ◽

2019 ◽

Vol 14 (7) ◽

pp. 621-627 ◽

Cited By ~ 3

Author(s):

Youhuang Bai ◽

Xiaozhuan Dai ◽

Tiantian Ye ◽

Peijing Zhang ◽

Xu Yan ◽

...

Keyword(s):

Protein Interactions ◽

Binding Proteins ◽

Rna Binding ◽

Rna Binding Proteins ◽

Populus Trichocarpa ◽

Noncoding Rnas ◽

Reference Database ◽

Protein Coding ◽

Arabidopsis Lyrata ◽

User Friendly

Background: Long noncoding RNAs (lncRNAs) are endogenous noncoding RNAs, arbitrarily longer than 200 nucleotides, that play critical roles in diverse biological processes. LncRNAs exist in different genomes ranging from animals to plants. Objective: PlncRNADB is a searchable database of lncRNA sequences and annotation in plants. Methods: We built a pipeline for lncRNA prediction in plants, providing a convenient utility for users to quickly distinguish potential noncoding RNAs from protein-coding transcripts. Results: More than five thousand lncRNAs are collected from four plant species (Arabidopsis thaliana, Arabidopsis lyrata, Populus trichocarpa and Zea mays) in PlncRNADB. Moreover, our database provides the relationship between lncRNAs and various RNA-binding proteins (RBPs), which can be displayed through a user-friendly web interface. Conclusion: PlncRNADB can serve as a reference database to investigate the lncRNAs and their interaction with RNA-binding proteins in plants. The PlncRNADB is freely available at http://bis.zju.edu.cn/PlncRNADB/.

Adaptability of Ultrasonic Lamb Wave Touchscreen to the Variations in Touch Force and Touch Area

Sensors ◽

10.3390/s21051736 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1736

Author(s):

Zengchong Yang ◽

Xiucheng Liu ◽

Bin Wu ◽

Ren Liu

Keyword(s):

Lamb Wave ◽

Weight Coefficient ◽

The Self ◽

Reference Database ◽

Learning Method ◽

Improved Method ◽

Large Area ◽

Localization Model ◽

Reference Databases ◽

First Time

Previous studies on the Lamb wave touchscreen (LWT) assumed that the unknown touch had parameters consistent with the acoustic fingerprints in the reference database. The adaptability of the LWT to variations in touch force and touch area was investigated in this study for the first time. The databases of acoustic fingerprints were collected automatically with an experimental prototype of the LWT employing three transmitter–receiver pairs. A self-adaptively updated weight coefficient for the transmitter–receiver pairs in use was employed to improve the accuracy of the localization model, which was established using a learning method. The performance of the improved method in locating single- and two-touch actions with reference databases of different parameters was carefully evaluated. The robustness of the LWT to variation in touch force varied with the touch area. Moreover, it was feasible to locate large-area touch actions with reference databases of small touch areas, as long as the unknown touch and the reference databases met the condition of equivalent averaged stress.

metaXplor: an interactive viral and microbial metagenomic data manager

GigaScience ◽

10.1093/gigascience/giab001 ◽

2021 ◽

Vol 10 (2) ◽

Author(s):

Guilhem Sempéré ◽

Adrien Pétel ◽

Magsen Abbé ◽

Pierre Lefeuvre ◽

Philippe Roumagnac ◽

...

Keyword(s):

Heterogeneous Data ◽

Metagenomic Data ◽

Online Data ◽

Data Repositories ◽

Ongoing Research ◽

Efficient Management ◽

Public Data ◽

Reference Databases ◽

Interactive Data ◽

User Friendly

Background: Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both shotgun- and metabarcoding-based metagenomics, while online reference databases and user-friendly tools exist for running various types of analyses (e.g., Qiime, Mothur, Megan, IMG/VR, Anvi'o, Qiita, MetaVir), scientists lack comprehensive software for easily building scalable, searchable, online data repositories on which they can rely during their ongoing research. Results: metaXplor is a scalable, distributable, fully web-interfaced application for managing, sharing, and exploring metagenomic data. Being based on a flexible NoSQL data model, it has few constraints regarding dataset contents and thus proves useful for handling outputs from both shotgun and metabarcoding techniques. By supporting incremental data feeding and providing means to combine filters on all imported fields, it allows for exhaustive content browsing, as well as rapid narrowing to find specific records. The application also features various interactive data visualization tools, ways to query contents by BLASTing external sequences, and an integrated pipeline to enrich assignments with phylogenetic placements. The project home page provides the URL of a live instance allowing users to test the system on public data. Conclusion: metaXplor allows efficient management and exploration of metagenomic data. Its availability as a set of Docker containers, making it easy to deploy on academic servers, on the cloud, or even on personal computers, will facilitate its adoption.

The Metano Modeling Toolbox MMTB: An Intuitive, Web-Based Toolbox Introduced by Two Use Cases

Metabolites ◽

10.3390/metabo11020113 ◽

2021 ◽

Vol 11 (2) ◽

pp. 113

Author(s):

Julia Koblitz ◽

Sabine Will ◽

S. Riemer ◽

Thomas Ulas ◽

Meina Neumann-Schaal ◽

...

Keyword(s):

Metabolic Networks ◽

Point Of View ◽

Use Cases ◽

Mathematical Methods ◽

State Behavior ◽

Web Based ◽

Research Fields ◽

Balance Analysis ◽

Metabolic Models ◽

User Friendly

Genome-scale metabolic models are of high interest in a number of different research fields. Flux balance analysis (FBA) and other mathematical methods allow the prediction of the steady-state behavior of metabolic networks under different environmental conditions. However, many existing applications for flux optimizations do not provide a metabolite-centric view on fluxes. Metano is a standalone, open-source toolbox for the analysis and refinement of metabolic models. While flux distributions in metabolic networks are predominantly analyzed from a reaction-centric point of view, the Metano methods of split-ratio analysis and metabolite flux minimization also allow a metabolite-centric view on flux distributions. In addition, we present MMTB (Metano Modeling Toolbox), a web-based toolbox for metabolic modeling including a user-friendly interface to Metano methods. MMTB assists during bottom-up construction of metabolic models by integrating reaction and enzymatic annotation data from different databases. Furthermore, MMTB is especially designed for non-experienced users by providing an intuitive interface to the most commonly used modeling methods and offering novel visualizations. Additionally, MMTB allows users to upload their models, which can in turn be explored and analyzed by the community. We introduce MMTB by two use cases, involving a published model of Corynebacterium glutamicum and a newly created model of Phaeobacter inhibens.

An Interactive Web Application for the Statistical Analysis of Continuous Glucose Monitoring Data in Epidemiological Studies

Journal of Diabetes Science and Technology ◽

10.1177/1932296820985570 ◽

2021 ◽

pp. 193229682098557

Author(s):

Alysha M. De Livera ◽

Jonathan E. Shaw ◽

Neale Cohen ◽

Anne Reutens ◽

Agus Salim

Keyword(s):

Continuous Glucose Monitoring ◽

Web Application ◽

Diabetes Management ◽

Glucose Monitoring ◽

Epidemiological Studies ◽

Interactive Software ◽

Web Based ◽

Software Applications ◽

Visualization Of Data ◽

User Friendly

Motivation: Continuous glucose monitoring (CGM) systems are an essential part of novel technology in diabetes management and care. CGM studies have become increasingly popular among researchers, healthcare professionals, and people with diabetes due to the large amount of useful information that can be collected using CGM systems. The analysis of the data from these studies for research purposes, however, remains a challenge due to the characteristics and large volume of the data. Results: Currently, there are no publicly available interactive software applications that can perform statistical analyses and visualization of data from CGM studies. With the rapidly increasing popularity of CGM studies, such an application is becoming necessary for anyone who works with these large CGM datasets, in particular for those with little background in programming or statistics. CGMStatsAnalyser is a publicly available, user-friendly, web-based application, which can be used to interactively visualize, summarize, and statistically analyze voluminous and complex CGM datasets together with the subject characteristics with ease.
