Microbe-ID: An open source toolbox for microbial genotyping and species identification

Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allow for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (www.microbe-id.org) and provided a working implementation for the genus Phytophthora (www.phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on GitHub and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID.
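The idea behind placing an unknown multilocus genotype among reference genotypes can be illustrated with a small sketch. The Python below is not the Microbe-ID code: the allele-mismatch distance, the function names, and the SSR allele sizes are all hypothetical, chosen only for illustration. It also builds a single minimum spanning tree with Prim's algorithm, whereas a minimum spanning network would additionally keep every tied edge.

```python
def allele_distance(a, b):
    """Fraction of loci at which two multilocus genotypes differ (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("genotypes must have the same number of loci")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def minimum_spanning_tree(genotypes):
    """Prim's algorithm over all samples; returns (u, v, distance) edges."""
    names = list(genotypes)
    if not names:
        return []
    in_tree = [names[0]]
    remaining = names[1:]
    edges = []
    while remaining:
        # Pick the cheapest edge that joins a new sample to the tree.
        u, v, d = min(
            ((u, v, allele_distance(genotypes[u], genotypes[v]))
             for u in in_tree for v in remaining),
            key=lambda edge: edge[2],
        )
        edges.append((u, v, d))
        in_tree.append(v)
        remaining.remove(v)
    return edges


def nearest_reference(query, references):
    """Name of the reference genotype closest to the query."""
    return min(references, key=lambda name: allele_distance(query, references[name]))


# Hypothetical SSR allele sizes at four loci (illustrative values only).
references = {
    "lineage_A_1": (150, 202, 310, 98),
    "lineage_A_2": (150, 202, 312, 98),
    "lineage_B_1": (154, 210, 320, 104),
    "lineage_B_2": (154, 210, 320, 106),
}
query = (150, 202, 310, 100)

tree = minimum_spanning_tree({**references, "query": query})
closest = nearest_reference(query, references)
```

In this toy data the query differs from `lineage_A_1` at one locus only, so it attaches to lineage A in the tree. A production tool would typically use a distance designed for SSR data and add bootstrap support, as the abstract describes.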

Microbe-ID: An open source toolbox for microbial genotyping and species identification

10.7287/peerj.preprints.2005v1 ◽

2016 ◽

Author(s):

Javier F Tabima ◽

Sydney E Everhart ◽

Meredith M Larsen ◽

Alexandra J Weisberg ◽

Zhian N Kamvar ◽

...

Keyword(s):

Open Source ◽

Species Identification ◽

Invasive Plant ◽

Reference Sequence ◽

Bootstrap Support ◽

Reference Database ◽

Bioinformatic Tools ◽

Minimum Spanning Network ◽

User Friendly ◽

Analytical Tools

Microbe-ID: an open source toolbox for microbial genotyping and species identification

PeerJ ◽

10.7717/peerj.2279 ◽

2016 ◽

Vol 4 ◽

pp. e2279 ◽

Cited By ~ 2

Author(s):

Javier F. Tabima ◽

Sydney E. Everhart ◽

Meredith M. Larsen ◽

Alexandra J. Weisberg ◽

Zhian N. Kamvar ◽

...

Keyword(s):

Open Source ◽

Species Identification ◽

Invasive Plant ◽

Reference Sequence ◽

Bootstrap Support ◽

Reference Database ◽

Bioinformatic Tools ◽

Link Type ◽

Minimum Spanning Network ◽

Analytical Tools

Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allow for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (microbe-id.org) and provided a working implementation for the genus Phytophthora (phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on GitHub and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID.

Rapid, raw-read reference and identification (R4IDs): A flexible platform for rapid generic species ID using long-read sequencing technology

10.1101/281048 ◽

2018 ◽

Cited By ~ 2

Author(s):

Joe Parker ◽

Andrew Helmstetter ◽

James Crowe ◽

John Iacona ◽

Dion Devey ◽

...

Keyword(s):

Dna Sequencing ◽

Species Identification ◽

Sequence Data ◽

Vascular Plant ◽

Reference Sequence ◽

Read Length ◽

Reference Database ◽

Sequencing Technology ◽

Long Read ◽

Suitable Reference

The versatility of the current DNA sequencing platforms and the development of portable nanopore sequencers mean that it has never been easier to collect genetic data for unknown sample ID. DNA barcoding and meta-barcoding have become increasingly popular and barcode databases continue to grow at an impressive rate. However, the number of canonical genome assemblies (reference or draft) that are publicly available is relatively tiny, hindering the more widespread use of genome-scale DNA sequencing technology for accurate species identification and discovery. Here, we show that rapid raw-read reference datasets, or R4IDs for short, generated in a matter of hours on the Oxford Nanopore MinION, can bridge this gap and accelerate the generation of usable reference sequence data. By exploiting the long read length of this technology, shotgun genomic sequencing of a small portion of an organism’s genome can act as a suitable reference database despite the low sequencing coverage. These R4IDs can then be used for accurate species identification with minimal amounts of re-sequencing effort (1000s of reads). We demonstrated the capabilities of this approach with six vascular plant species for which we created R4IDs in the laboratory and then re-sequenced live at the Kew Science Festival 2016. We further validated our method using simulations to determine the broader applicability of the approach. Our data analysis pipeline has been made available as a Dockerised workflow for simple, scalable deployment for a range of uses.

cpn60 barcode sequences accurately identify newly defined genera within the Lactobacillaceae

10.1101/2021.02.24.432354 ◽

2021 ◽

Author(s):

Ishika Shukla ◽

Janet E. Hill

Keyword(s):

Species Identification ◽

Sequence Data ◽

Sequence Diversity ◽

Reference Sequence ◽

Reference Database ◽

New Genera ◽

Accurate Identification ◽

Detection And Identification ◽

Taxonomic Framework ◽

Definition Of

The cpn60 barcode sequence is established as an informative target for microbial species identification. Applications of cpn60 barcode sequencing are supported by the availability of “universal” PCR primers for its amplification and a curated reference database of cpn60 sequences, cpnDB. A recent reclassification of lactobacilli involving the definition of 23 new genera provided an opportunity to update cpnDB and to determine if the cpn60 barcode could be used for accurate identification of species consistent with the new framework. Analysis of 275 cpn60 sequences representing 258/269 of the validly named species in Lactobacillus, Paralactobacillus and the 23 newer genera showed that cpn60-based sequence relationships were consistent with the whole-genome-based phylogeny. Aligning or mapping full length barcode sequences or a 150 bp subsequence resulted in accurate and unambiguous species identification in almost all cases. Taken together, our results show that the combination of available reference sequence data, “universal” barcode amplification primers, and the inherent sequence diversity within the cpn60 barcode make it a useful target for the detection and identification of lactobacilli as defined by the latest taxonomic framework.

Significance and Impact of the Study: The genus Lactobacillus recently underwent a major reorganization resulting in the definition of 23 new genera. Lactobacilli are widespread in environmental and host-associated microbiomes and are exploited in food and biotechnology applications, making methods for their accurate identification desirable. Here we show that the combination of a reference sequence database, “universal” barcode amplification primers, and the inherent sequence diversity within the cpn60 barcode make it a useful target for the detection and identification of lactobacilli as defined by the latest taxonomic framework.
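As a rough illustration of identifying a species from a short barcode subsequence, the sketch below scores a query against each reference by its best ungapped percent identity and reports the top hit. This is not the method used in the study, which relied on alignment and mapping tools; the function names, the toy sequences, and the species labels are hypothetical.

```python
def best_ungapped_identity(query, reference):
    """Highest fraction of matching bases over all ungapped placements."""
    if len(query) > len(reference):
        query, reference = reference, query
    best = 0.0
    for start in range(len(reference) - len(query) + 1):
        window = reference[start:start + len(query)]
        matches = sum(q == r for q, r in zip(query, window))
        best = max(best, matches / len(query))
    return best


def identify(query, reference_db):
    """Return (best_matching_name, identity) for a query sequence."""
    scores = {
        name: best_ungapped_identity(query, seq)
        for name, seq in reference_db.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy reference "barcodes" (made-up sequences, not real cpn60 data).
reference_db = {
    "species_A": "ACGTACGTTAGCCGATAGGCTA",
    "species_B": "ACGTTCGTAAGCCGTTAGGCAA",
}
query = "TAGCCGATAG"  # short subsequence taken from species_A

hit, identity = identify(query, reference_db)
```

An ungapped comparison is the simplest possible scoring choice; real barcode data with insertions or deletions would need a gapped aligner, and an identification should only be called unambiguous when the top score clearly exceeds the runner-up.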

PlncRNADB: A Repository of Plant lncRNAs and lncRNA-RBP Protein Interactions

Current Bioinformatics ◽

10.2174/1574893614666190131161002 ◽

2019 ◽

Vol 14 (7) ◽

pp. 621-627 ◽

Cited By ~ 3

Author(s):

Youhuang Bai ◽

Xiaozhuan Dai ◽

Tiantian Ye ◽

Peijing Zhang ◽

Xu Yan ◽

...

Keyword(s):

Protein Interactions ◽

Binding Proteins ◽

Rna Binding ◽

Rna Binding Proteins ◽

Populus Trichocarpa ◽

Noncoding Rnas ◽

Reference Database ◽

Protein Coding ◽

Arabidopsis Lyrata ◽

User Friendly

Background: Long noncoding RNAs (lncRNAs) are endogenous noncoding RNAs, arbitrarily longer than 200 nucleotides, that play critical roles in diverse biological processes. LncRNAs exist in different genomes ranging from animals to plants. Objective: PlncRNADB is a searchable database of lncRNA sequences and annotation in plants. Methods: We built a pipeline for lncRNA prediction in plants, providing a convenient utility for users to quickly distinguish potential noncoding RNAs from protein-coding transcripts. Results: More than five thousand lncRNAs are collected from four plant species (Arabidopsis thaliana, Arabidopsis lyrata, Populus trichocarpa and Zea mays) in PlncRNADB. Moreover, our database provides the relationship between lncRNAs and various RNA-binding proteins (RBPs), which can be displayed through a user-friendly web interface. Conclusion: PlncRNADB can serve as a reference database to investigate the lncRNAs and their interaction with RNA-binding proteins in plants. The PlncRNADB is freely available at http://bis.zju.edu.cn/PlncRNADB/.

MackDroid - An Android based Application to monitor devices

Journal of Communications Technology Electronics and Computer Science ◽

10.22385/jctecs.v9i0.130 ◽

2016 ◽

Vol 9 ◽

pp. 1

Author(s):

Maaz Sirkhot ◽

Ekta Sirwani ◽

Aishwarya Kourani ◽

Akshit Batheja ◽

Kajal Jethanand Jewani

Keyword(s):

Operating System ◽

Open Source ◽

Vital Role ◽

Mobile Users ◽

Android Application ◽

Web Page ◽

Technological World ◽

Monitoring Service ◽

User Friendly

In this technological world, smartphones can be considered one of the most far-reaching inventions. They play a vital role in connecting people socially. The number of mobile users with an Android-based smartphone has increased rapidly over the last few years, and organizations, cyber cell departments, and government authorities feel the need to monitor activity on certain targeted devices in order to carry out their respective jobs properly. With the advent of smartphones, Android has also become one of the most popular and widely used operating systems. Its highlights are that it is user friendly, smartly designed, flexible, highly customizable, and supports the latest technologies such as IoT. One feature that sets it apart is that it is based on Linux and is open source for all developers. For this reason, our project, MackDroid, is an Android-based application that collects data from a remote device, stores it, and displays it on a PHP-based web page. It is primarily a monitoring service that analyzes the contents and sorts them into categories such as call logs, chats, and key logs. Our project aims to develop an Android application that can track, monitor, and grab data from the device and store it on a server that can be accessed by the handler of the application.

Toward a global reference database of COI barcodes for marine zooplankton

Marine Biology ◽

10.1007/s00227-021-03887-y ◽

2021 ◽

Vol 168 (6) ◽

Author(s):

Ann Bucklin ◽

Katja T. C. A. Peijnenburg ◽

Ksenia N. Kosobokova ◽

Todd D. O’Brien ◽

Leocadio Blanco-Bercial ◽

...

Keyword(s):

Species Diversity ◽

Dna Sequences ◽

Reference Sequence ◽

Global Ocean ◽

Reference Database ◽

Data Repositories ◽

Marine Zooplankton ◽

The North ◽

Coi Sequences ◽

Taxonomic Groups

Characterization of species diversity of zooplankton is key to understanding, assessing, and predicting the function and future of pelagic ecosystems throughout the global ocean. The marine zooplankton assemblage, including only metazoans, is highly diverse and taxonomically complex, with an estimated ~28,000 species of 41 major taxonomic groups. This review provides a comprehensive summary of DNA sequences for the barcode region of mitochondrial cytochrome oxidase I (COI) for identified specimens. The foundation of this summary is the MetaZooGene Barcode Atlas and Database (MZGdb), a new open-access data and metadata portal that is linked to NCBI GenBank and BOLD data repositories. The MZGdb provides enhanced quality control and tools for assembling COI reference sequence databases that are specific to selected taxonomic groups and/or ocean regions, with associated metadata (e.g., collection georeferencing, verification of species identification, molecular protocols), and tools for statistical analysis, mapping, and visualization. To date, over 150,000 COI sequences for ~5600 described species of marine metazoan plankton (including holo- and meroplankton) are available via the MZGdb portal. This review uses the MZGdb as a resource for summaries of COI barcode data and metadata for important taxonomic groups of marine zooplankton and selected regions, including the North Atlantic, Arctic, North Pacific, and Southern Oceans. The MZGdb is designed to provide a foundation for analysis of species diversity of marine zooplankton based on DNA barcoding and metabarcoding for assessment of marine ecosystems and rapid detection of the impacts of climate change.

An Interactive WebGIS Framework for Coastal Erosion Risk Management

Journal of Marine Science and Engineering ◽

10.3390/jmse9060567 ◽

2021 ◽

Vol 9 (6) ◽

pp. 567

Author(s):

Alessandra Capolupo ◽

Cristina Monterisi ◽

Alessandra Saponieri ◽

Fabio Addona ◽

Leonardo Damiani ◽

...

Keyword(s):

Open Source ◽

3D Visualization ◽

Erosion Risk ◽

Innovative Strategies ◽

Web Mapping ◽

Natural Processes ◽

Multi Scale ◽

Interactive Interface ◽

Multi Temporal ◽

User Friendly

The Italian coastline stretches over about 8350 km, with 3600 km of beaches, representing a significant resource for the country. Natural processes and anthropic interventions continually threaten its morphology, moulding its shape and triggering soil erosion phenomena. Thus, many scholars have been focusing their work on investigating and monitoring shoreline instability. Outcomes of such activities can be widely disseminated and shared with expert and non-expert users through Web mapping. This paper describes the performance of a WebGIS prototype designed to disseminate the results of the Italian project Innovative Strategies for the Monitoring and Analysis of Erosion Risk, known as the STIMARE project. While aiming to include the entire national coastline, three study areas along the regional coasts of Puglia and Emilia Romagna have already been implemented as pilot cases. This WebGIS was generated using Free and Open-Source Software for Geographic Information Systems (FOSS4G). The platform was designed by combining Apache HTTP Server and GeoServer as open-source servers and PostgreSQL (with the PostGIS extension) as the database. The pure JavaScript libraries OpenLayers and Cesium were used to obtain a hybrid 2D and 3D visualization. A user-friendly interactive interface was programmed to help users visualize and download geospatial data in several formats (pdf, kml and shp), in accordance with the European INSPIRE directives, satisfying both multi-temporal and multi-scale perspectives.

Overcoming limitations to environmental DNA studies: A coastal temperate reference sequence database for multiple chloroplast gene regions generated in a single assay.

10.22541/au.163252330.05592688/v1 ◽

2021 ◽

Author(s):

Nicole Foster ◽

Kor-jent Dijk ◽

Ed Biffin ◽

Jennifer Young ◽

Vicki Thomson ◽

...

Keyword(s):

Dna Sequences ◽

Dna Barcode ◽

Environmental Dna ◽

Reference Sequence ◽

Reference Database ◽

Chloroplast Gene ◽

Coastal Plants ◽

Reference Databases ◽

Targeted Capture ◽

Comprehensive Reference

A proliferation of environmental DNA (eDNA) research has increased the reliance on reference sequence databases to assign unknown DNA sequences to known taxa. Without comprehensive reference databases, DNA extracted from environmental samples cannot be correctly assigned to taxa, limiting the use of this genetic information to identify organisms in unknown sample mixtures. For animals, standard metabarcoding practices involve amplification of the mitochondrial cytochrome c oxidase subunit 1 (CO1) region, which is universally amplifiable across the majority of animal taxa. This region, however, does not work well as a DNA barcode for plants and fungi, and there is no similar universal single barcode locus that has the same species resolution. Therefore, generating reference sequences has been more difficult, and several loci have been suggested for use in parallel to reach species-level identification. For this reason, we developed a multi-gene targeted capture approach to generate reference DNA sequences for plant taxa across 20 target chloroplast gene regions in a single assay. We successfully compiled a reference database for 93 temperate coastal plants including seagrasses, mangroves, and saltmarshes/samphires. We demonstrate the importance of a comprehensive reference database to prevent species going undetected in eDNA studies. We also investigate how using multiple chloroplast gene regions impacts the ability to discriminate between taxa.

Microvessel Chaste: An Open Library for Spatial Modelling of Vascularized Tissues

10.1101/105692 ◽

2017 ◽

Cited By ~ 2

Author(s):

J.A. Grogan ◽

A.J. Connor ◽

B. Markelc ◽

R.J. Muschel ◽

P.K. Maini ◽

...

Keyword(s):

Open Source ◽

Tumour Growth ◽

Spatial Models ◽

3D Models ◽

Tissue Growth ◽

Coronary Perfusion ◽

Analysis Model ◽

User Friendly

Spatial models of vascularized tissues are widely used in computational physiology to study, for example, tumour growth, angiogenesis, osteogenesis, coronary perfusion and oxygen delivery. Composition of such models is time-consuming, with many researchers writing custom software for this purpose. Recent advances in imaging have produced detailed three-dimensional (3D) datasets of vascularized tissues at the scale of individual cells. To fully exploit such data there is an increasing need for software that allows user-friendly composition of efficient 3D models of vascularized tissue growth, and comparison of predictions with in vivo or in vitro experiments and other models. Microvessel Chaste is a new open-source library for building spatial models of vascularized tissue growth. It can be used to simulate vessel growth and adaptation in response to mechanical and chemical stimuli, intra- and extra-vascular transport of nutrient, growth factor and drugs, and cell proliferation in complex 3D geometries. The library provides a comprehensive Python interface to solvers implemented in C++, allowing user-friendly model composition, and integration with experimental data. Such integration is facilitated by interoperability with a growing collection of scientific Python software for image processing, statistical analysis, model annotation and visualization. The library is available under an open-source Berkeley Software Distribution (BSD) licence at https://jmsgrogan.github.io/MicrovesselChaste. This article links to two reproducible example problems, showing how the library can be used to model tumour growth and angiogenesis with realistic vessel networks.
