The Microbe Directory v2.0: An Expanded Database of Ecological and Phenotypical Features of Microbes

Author(s):  
Maria A. Sierra ◽  
Chandrima Bhattacharya ◽  
Krista Ryon ◽  
Sophie Meierovich ◽  
Heba Shaaban ◽  
...  

Abstract The Microbe Directory (TMD) is a comprehensive database of annotations for microbial species, collating features such as gram-stain, capsid-symmetry, resistance to antibiotics and more. This work presents a significant improvement to the original Microbe Directory (2018). This update adds 68,852 taxa, many new annotation features, an interface for the statistical analysis of microbiomes based on TMD features, and a portal for the broad community to add or correct entries. This update also adds curated lists of gene annotations which are useful for characterizing microbial genomes. Much of the new data in TMD is sourced from a set of databases and independent studies, collating these data into a single quality-controlled and curated source. This will allow researchers and clinicians to have easier access to microbial data and provide for the possibility of serendipitous discovery of otherwise unexpected trends.

2020 ◽  
Vol 49 (D1) ◽  
pp. D751-D763 ◽  
Author(s):  
I-Min A Chen ◽  
Ken Chu ◽  
Krishnaveni Palaniappan ◽  
Anna Ratner ◽  
Jinghua Huang ◽  
...  

Abstract The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) contains annotated isolate genome and metagenome datasets sequenced at the DOE’s Joint Genome Institute (JGI), submitted by external users, or imported from public sources such as NCBI. IMG v 6.0 includes advanced search functions and a new tool for statistical analysis of mixed sets of genomes and metagenome bins. The new IMG web user interface also has a new Help page with additional documentation and webinar tutorials to help users better understand how to use various IMG functions and tools for their research. New datasets have been processed with the prokaryotic annotation pipeline v.5, which includes extended protein family assignments.


2017 ◽  
Vol 15 (03) ◽  
pp. 1740001 ◽  
Author(s):  
Diem-Trang Pham ◽  
Shanshan Gao ◽  
Vinhthuy Phan

Determining abundances of microbial genomes in metagenomic samples is an important problem in analyzing metagenomic data. Although homology-based methods are popular, they have been shown to be computationally expensive due to the alignment of tens of millions of reads from metagenomic samples to reference genomes of hundreds to thousands of environmental microbial species. We introduce an efficient alignment-free approach to estimate abundances of microbial genomes in metagenomic samples. The approach is based on solving linear and quadratic programs, which are represented by genome-specific markers (GSM). We compared our method against popular alignment-free and homology-based methods. Without contamination, our method was more accurate than other alignment-free methods while being much faster than a homology-based method. In more realistic settings where samples were contaminated with human DNA, our method was the most accurate method in predicting abundance at varying levels of contamination. We achieve higher accuracy than both alignment-free and homology-based methods.
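The idea of estimating abundances from genome-specific markers can be illustrated with a small sketch. This is not the authors' formulation: it assumes each marker occurs in exactly one genome, so a least-squares (quadratic) objective decouples per genome and has a closed-form minimizer. The function name, marker names, and counts below are all hypothetical.

```python
def estimate_abundances(marker_multiplicity, marker_counts):
    """Estimate relative genome abundances from marker counts.

    marker_multiplicity: {genome: {marker: copies of marker in that genome}}
    marker_counts:       {marker: occurrences observed in the sample reads}

    Assumption (not from the paper): every marker is specific to one genome,
    so minimizing sum_m (a_m * x_g - c_m)^2 for each genome g gives
    x_g = sum(a_m * c_m) / sum(a_m^2).
    """
    raw = {}
    for genome, markers in marker_multiplicity.items():
        num = sum(a * marker_counts.get(m, 0) for m, a in markers.items())
        den = sum(a * a for a in markers.values())
        raw[genome] = num / den if den else 0.0
    total = sum(raw.values())
    # Normalize so the estimates sum to 1 (relative abundances).
    return {g: (v / total if total else 0.0) for g, v in raw.items()}


# Hypothetical toy input: two genomes, four markers.
multiplicity = {
    "genome_1": {"m1": 1, "m2": 1},
    "genome_2": {"m3": 1, "m4": 2},
}
counts = {"m1": 30, "m2": 30, "m3": 10, "m4": 20}
abund = estimate_abundances(multiplicity, counts)
print(abund)
```

When markers are shared between genomes or constraints couple the unknowns, the problem no longer decouples and a general linear or quadratic programming solver would be needed instead.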


2021 ◽  
Vol 1 ◽  
Author(s):  
Steven L. Salzberg ◽  
Derrick E. Wood

Ten years ago, the dramatic rise in the number of microbial genomes led to an inflection point, when the approach of finding short, exact matches in a comprehensive database became just as accurate as older, slower approaches. The new idea led to a method that was hundreds of times faster than those that came before. Today, exact k-mer matching is a standard technique at the heart of many microbiome analysis tools.
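Exact k-mer matching can be sketched in a few lines. The example below is a toy illustration and not the method of any particular tool: it indexes every k-mer of a few reference sequences and assigns a read to the label hit by the most k-mers. The sequences, labels, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_kmer_index(genomes, k):
    """Map each k-mer to the set of reference labels that contain it."""
    index = defaultdict(set)
    for label, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(label)
    return index


def classify_read(read, index, k):
    """Return the label with the most exact k-mer hits, or None if no hits."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        # dict.get avoids inserting missing k-mers into the defaultdict.
        for label in index.get(read[i:i + k], ()):
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy references.
genomes = {
    "taxon_A": "ACGTACGTTGCAAGCT",
    "taxon_B": "TTGACCGGTAACCGTA",
}
index = build_kmer_index(genomes, k=5)
print(classify_read("CGTTGCAAG", index, k=5))
```

Lookups are hash-table queries rather than alignments, which is where the speed advantage described above comes from; production tools add compact index structures and taxonomy-aware handling of k-mers shared between references.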


2020 ◽  
Vol 36 (15) ◽  
pp. 4341-4344 ◽  
Author(s):  
Christopher J Neely ◽  
Elaina D Graham ◽  
Benjamin J Tully

Abstract Summary As the importance of microbiome research continues to become more prevalent and essential to understanding a wide variety of ecosystems (e.g. marine, built, host associated, etc.), there is a need for researchers to be able to perform highly reproducible and quality analysis of microbial genomes. MetaSanity incorporates analyses from 11 existing and widely used genome evaluation and annotation suites into a single, distributable workflow, thereby decreasing the workload of microbiologists by allowing for a flexible, expansive data analysis pipeline. MetaSanity has been designed to provide separate, reproducible workflows that (i) can determine the overall quality of a microbial genome, while providing a putative phylogenetic assignment, and (ii) can assign structural and functional gene annotations with varying degrees of specificity to suit the needs of the researcher. The software suite combines the results from several tools to provide broad insights into overall metabolic function. Importantly, this software provides built-in optimization for ‘big data’ analysis by storing all relevant outputs in an SQL database, allowing users to query all the results for the elements that will most impact their research. Availability and implementation MetaSanity is provided under the GNU General Public License v.3.0 and is available for download at https://github.com/cjneely10/MetaSanity. This application is distributed as a Docker image. MetaSanity is implemented in Python3/Cython and C++. Instructions for its installation and use are available within the GitHub wiki page at https://github.com/cjneely10/MetaSanity/wiki, and additional instructions are available at https://cjneely10.github.io/year-archive/. MetaSanity is optimized for users with limited programming experience. Supplementary information Supplementary data are available at Bioinformatics online.
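The benefit of keeping pipeline outputs in an SQL database can be shown with a generic sketch using Python's built-in sqlite3 module. This does not reproduce MetaSanity's actual schema or interface: the table name, column names, and values are hypothetical and chosen only to show how a single query can filter genomes by quality metrics.

```python
import sqlite3

# In-memory database standing in for a pipeline's output store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genome_quality ("
    "genome TEXT PRIMARY KEY, completeness REAL, contamination REAL)"
)

# Hypothetical per-genome quality results.
rows = [
    ("bin_001", 97.5, 1.2),
    ("bin_002", 62.0, 8.9),
    ("bin_003", 91.3, 4.0),
]
conn.executemany("INSERT INTO genome_quality VALUES (?, ?, ?)", rows)

# Select only genomes that pass illustrative quality thresholds.
high_quality = [
    row[0]
    for row in conn.execute(
        "SELECT genome FROM genome_quality "
        "WHERE completeness >= ? AND contamination <= ? ORDER BY genome",
        (90.0, 5.0),
    )
]
print(high_quality)
conn.close()
```

Parameterized queries such as this one let a user narrow a large result set to the records of interest without re-running any analysis step.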


mSystems ◽  
2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Victor Gambarini ◽  
Olga Pantos ◽  
Joanne M. Kingsbury ◽  
Louise Weaver ◽  
Kim M. Handley ◽  
...  

ABSTRACT The number of plastic-degrading microorganisms reported is rapidly increasing, making it possible to explore the conservation and distribution of presumed plastic-degrading traits across the diverse microbial tree of life. Putative degraders of conventional high-molecular-weight polymers, including polyamide, polystyrene, polyvinylchloride, and polypropylene, are spread widely across bacterial and fungal branches of the tree of life, although evidence for plastic degradation by a majority of these taxa appears limited. In contrast, we found strong degradation evidence for the synthetic polymer polylactic acid (PLA), and the microbial species related to its degradation are phylogenetically conserved among the bacterial family Pseudonocardiaceae. We collated data on genes and enzymes related to the degradation of all types of plastic to identify 16,170 putative plastic degradation orthologs by mining publicly available microbial genomes. The plastic with the largest number of putative orthologs, 10,969, was the natural polymer polyhydroxybutyrate (PHB), followed by the synthetic polymers polyethylene terephthalate (PET) and polycaprolactone (PCL), with 8,233 and 6,809 orthologs, respectively. These orthologous genes were discovered in the genomes of 6,000 microbial species, and most of them are as yet not identified as plastic degraders. Furthermore, all these species belong to 12 different microbial phyla, of which just 7 phyla have reported degraders to date. We have centralized information on reported plastic-degrading microorganisms within an interactive and updatable phylogenetic tree and database to confirm the global and phylogenetic diversity of putative plastic-degrading taxa and provide new insights into the evolution of microbial plastic-degrading capabilities and avenues for future discovery.
IMPORTANCE We have collated the most complete database of microorganisms identified as being capable of degrading plastics to date. These data allow us to explore the phylogenetic distribution of these organisms and their enzymes, showing that traits for plastic degradation are predominantly not phylogenetically conserved. We found 16,170 putative plastic degradation orthologs in the genomes of 12 different phyla, which suggests a vast potential for the exploration of these traits in other taxa. Besides making the database available to the scientific community, we also created an interactive phylogenetic tree that can display all of the collated information, facilitating visualization and exploration of the data. Both the database and the tree are regularly updated to keep up with new scientific reports. We expect that our work will contribute to the field by increasing the understanding of the genetic diversity and evolution of microbial plastic-degrading traits.


2004 ◽  
Vol 18 (17n19) ◽  
pp. 2448-2454 ◽  
Author(s):  
T. Y. CHEN ◽  
L. C. HSIEH ◽  
C. H. CHANG ◽  
L. F. LUO ◽  
F. M. JI ◽  
...  

Statistical analysis of frequency occurrence of short words in complete genomes reveals the existence of a set of universal lengths common to all extant complete microbial genomes. This phenomenon is consistent with a model for genome growth in which primitive genomes grew mainly by maximally stochastic duplications of short segments from an initial length of about 200 nucleotides. The relevance of these results to the so-called RNA world in which life began and evolved before the rise of proteins is discussed.


2020 ◽  
Vol 12 (14) ◽  
pp. 5501
Author(s):  
Tamás Mátrai ◽  
János Tóth

The world population will reach 9.8 billion by 2050, with increased urbanization. Cycling is one of the fastest developing sustainable transport solutions. With the spread of public bike sharing (PBS) systems, it is very important to understand the differences between systems. This article focuses on the clustering of different bike sharing systems around the world. The lack of a comprehensive database about PBS systems in the world does not allow comparing or evaluating them. Therefore, the first step was to gather data about existing systems. The existing systems could be categorized by grouping criteria, and then typical models can be defined. Our assumption was that 90% of the systems could be classified into four clusters. We used clustering techniques and statistical analysis to create these clusters. However, our estimation proved to be too optimistic; therefore, we only used four distinct clusters (public, private, mixed, other) and the results were acceptable. The analysis of the different clusters and the identification of their common features is the next step of this line of research; however, some general characteristics of the proposed clusters are described. The result is a general method that could identify the type of a PBS system.


2019 ◽  
Author(s):  
Christopher J Neely ◽  
Elaina D Graham ◽  
Benjamin J Tully

Abstract Summary As the importance of microbiome research continues to become more prevalent and essential to understanding a wide variety of ecosystems (e.g., marine, built, host-associated, etc.), there is a need for researchers to be able to perform highly reproducible and quality analysis of microbial genomes. MetaSanity incorporates analyses from eleven existing and widely used genome evaluation and annotation suites into a single, distributable workflow, thereby decreasing the workload of microbiologists by allowing for a flexible, expansive data analysis pipeline. MetaSanity has been designed to provide separate, reproducible workflows that (1) can determine the overall quality of a microbial genome, while providing a putative phylogenetic assignment, and (2) can assign structural and functional gene annotations with varying degrees of specificity to suit the needs of the researcher. The software suite combines the results from several tools to provide broad insights into overall metabolic function and putative extracellular localization of peptidases and carbohydrate-active enzymes. Importantly, this software provides built-in optimization for “big data” analysis by storing all relevant outputs in an SQL database, allowing users to query all the results for the elements that will most impact their research. Availability and implementation MetaSanity is provided under the GNU General Public License v.3.0 and is available for download at https://github.com/cjneely10/MetaSanity. This application is distributed as a Docker image. MetaSanity is implemented in Python3/Cython and C++. Supplementary information Supplementary data are available below.


2009 ◽  
Vol 37 (Database) ◽  
pp. D479-D482 ◽  
Author(s):  
M. Pertea ◽  
K. Ayanbule ◽  
M. Smedinghoff ◽  
S. L. Salzberg

Gene ◽  
2008 ◽  
Vol 416 (1-2) ◽  
pp. 44-47 ◽  
Author(s):  
Yu-Hsiang Lin ◽  
Bill C.H. Chang ◽  
Pei-Wen Chiang ◽  
Sen-Lin Tang
