enviLink: A database linking contaminant biotransformation rules to enzyme classes in support of functional association mining

Mapping Intimacies ◽

10.1101/2021.05.20.442588 ◽

2021 ◽

Author(s):

Emanuel Schmid ◽

Kathrin Fenner

Keyword(s):

Microbial Communities ◽

De Novo ◽

Association Mining ◽

Chemical Contaminants ◽

Functional Association ◽

Transformation Rules ◽

Commercial Use ◽

Rdf Database ◽

Supporting Materials ◽

Support Research

Motivation: The ability to assess and engineer biotransformation of chemical contaminants present in the environment requires knowledge on which enzymes can catalyze specific contaminant biotransformation reactions. For the majority of over 100000 chemicals in commerce such knowledge is not available. Enumeration of enzyme classes potentially catalyzing observed or de novo predicted contaminant biotransformation reactions can support research that aims at experimentally uncovering enzymes involved in contaminant biotransformation in complex natural microbial communities. Database: enviLink is a new data module integrated into the enviPath database and contains 316 theoretically derived linkages between generalized biotransformation rules used for contaminant biotransformation prediction in enviPath and 3rd level EC classes. Rule-EC linkages have been derived using two reaction databases, i.e., Eawag-BBD in enviPath, focused on contaminant biotransformation reactions, and KEGG. 32.6% of identified rule-EC linkages overlap between the two databases, whereas 40.2% and 27.2%, respectively, are originating from Eawag-BBD and KEGG only. Implementation and availability: enviLink is encoded in RDF triples as part of the enviPath RDF database. enviPath is hosted on a public webserver (envipath.org) and all data is freely available for non-commercial use. enviLink can be searched online for individual transformation rules of interest (https://tinyurl.com/y63ath3k) and is also fully downloadable from the supporting materials (i.e., Jupyter notebook enviLink and tsv files provided through GitHub at https://github.com/emanuel-schmid/enviLink).
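The overlap figures quoted above can be illustrated with a small sketch. It is not taken from enviLink or its Jupyter notebook: the rule identifiers, the helper names (`third_level`, `overlap_shares`, `implied_counts`), and the toy linkages are all hypothetical. The only facts it relies on are that an EC number has up to four dot-separated levels and that the abstract reports 316 linkages split 32.6% / 40.2% / 27.2%.

```python
from typing import Dict, Set, Tuple

# A linkage pairs a rule identifier with a 3rd-level EC class.
# The rule identifiers used below are invented for illustration.
Linkage = Tuple[str, str]


def third_level(ec_number: str) -> str:
    """Truncate an EC number such as '1.14.13.2' to its 3rd-level class '1.14.13'."""
    parts = ec_number.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"need at least three EC levels, got {ec_number!r}")
    return ".".join(parts[:3])


def overlap_shares(a: Set[Linkage], b: Set[Linkage]) -> Dict[str, float]:
    """Share of the combined linkages found in both sets, only in a, and only in b."""
    union = a | b
    if not union:
        raise ValueError("no linkages to compare")
    n = len(union)
    return {
        "both": len(a & b) / n,
        "only_a": len(a - b) / n,
        "only_b": len(b - a) / n,
    }


def implied_counts(total: int, percentages: Dict[str, float]) -> Dict[str, int]:
    """Convert reported percentages of a total into rounded absolute counts."""
    return {key: round(total * pct / 100) for key, pct in percentages.items()}


# Toy linkage sets (hypothetical rules, arbitrary EC numbers).
bbd = {("rule-A", third_level("1.14.13.2")), ("rule-B", third_level("3.5.1.4"))}
kegg = {("rule-A", third_level("1.14.13.7")), ("rule-C", third_level("2.1.1.1"))}
shares = overlap_shares(bbd, kegg)

# Counts implied by the percentages reported in the abstract.
counts = implied_counts(316, {"both": 32.6, "bbd_only": 40.2, "kegg_only": 27.2})
```

Under these assumptions the reported percentages correspond to roughly 103 shared, 127 Eawag-BBD-only, and 86 KEGG-only linkages, which add up to the stated 316.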

Discovery of novel community-relevant small proteins in a simplified human intestinal microbiome

Microbiome ◽

10.1186/s40168-020-00981-z ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Hannes Petruschke ◽

Christian Schori ◽

Sebastian Canzler ◽

Sarah Riesbeck ◽

Anja Poehlein ◽

...

Keyword(s):

Microbial Communities ◽

Intestinal Microbiota ◽

De Novo ◽

Bacterial Species ◽

Intestinal Microbiome ◽

Single Strain ◽

Small Proteins ◽

Human Intestinal Microbiota ◽

Wide Range

Abstract Background The intestinal microbiota plays a crucial role in protecting the host from pathogenic microbes, modulating immunity and regulating metabolic processes. We studied the simplified human intestinal microbiota (SIHUMIx) consisting of eight bacterial species with a particular focus on the discovery of novel small proteins with less than 100 amino acids (= sProteins), some of which may contribute to shape the simplified human intestinal microbiota. Although sProteins carry out a wide range of important functions, they are still often missed in genome annotations, and little is known about their structure and function in individual microbes and especially in microbial communities. Results We created a multi-species integrated proteogenomics search database (iPtgxDB) to enable a comprehensive identification of novel sProteins. Six of the eight SIHUMIx species, for which no complete genomes were available, were sequenced and de novo assembled. Several proteomics approaches including two earlier optimized sProtein enrichment strategies were applied to specifically increase the chances for novel sProtein discovery. The search of tandem mass spectrometry (MS/MS) data against the multi-species iPtgxDB enabled the identification of 31 novel sProteins, of which the expression of 30 was supported by metatranscriptomics data. Using synthetic peptides, we were able to validate the expression of 25 novel sProteins. The comparison of sProtein expression in each single strain versus a multi-species community cultivation showed that six of these sProteins were only identified in the SIHUMIx community indicating a potentially important role of sProteins in the organization of microbial communities. Two of these novel sProteins have a potential antimicrobial function. Metabolic modelling revealed that a third sProtein is located in a genomic region encoding several enzymes relevant for the community metabolism within SIHUMIx. 
Conclusions We outline an integrated experimental and bioinformatics workflow for the discovery of novel sProteins in a simplified intestinal model system that can be generically applied to other microbial communities. The further analysis of novel sProteins uniquely expressed in the SIHUMIx multi-species community is expected to enable new insights into the role of sProteins on the functionality of bacterial communities such as those of the human intestinal tract.

Oral Spirochetes Implicated in Dental Diseases Are Widespread in Normal Human Subjects and Carry Extremely Diverse Integron Gene Cassettes

Applied and Environmental Microbiology ◽

10.1128/aem.00564-12 ◽

2012 ◽

Vol 78 (15) ◽

pp. 5288-5296 ◽

Cited By ~ 17

Author(s):

Yu-Wei Wu ◽

Mina Rho ◽

Thomas G. Doak ◽

Yuzhen Ye

Keyword(s):

Microbial Communities ◽

De Novo ◽

Human Subjects ◽

Human Microbiome ◽

Metagenomic Data ◽

De Bruijn Graph ◽

Content Type ◽

Gene Cassettes ◽

Dental Diseases ◽

Normal Human

ABSTRACT The NIH Human Microbiome Project (HMP) has produced several hundred metagenomic data sets, allowing studies of the many functional elements in human-associated microbial communities. Here, we survey the distribution of oral spirochetes implicated in dental diseases in normal human individuals, using recombination sites associated with the chromosomal integron in Treponema genomes, taking advantage of the multiple copies of the integron recombination sites (repeats) in the genomes, and using a targeted assembly approach that we have developed. We find that integron-containing Treponema species are present in ∼80% of the normal human subjects included in the HMP. Further, we are able to de novo assemble the integron gene cassettes using our constrained assembly approach, which employs a unique application of the de Bruijn graph assembly information; most of these cassette genes were not assembled in whole-metagenome assemblies and could not be identified by mapping sequencing reads onto the known reference Treponema genomes due to the dynamic nature of integron gene cassettes. Our study significantly enriches the gene pool known to be carried by Treponema chromosomal integrons, totaling 826 (598 97% nonredundant) genes. We characterize the functions of these gene cassettes: many of these genes have unknown functions. The integron gene cassette arrays found in the human microbiome are extraordinarily dynamic, with different microbial communities sharing only a small number of common genes.
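For readers unfamiliar with the data structure mentioned above, the following is a minimal sketch of a de Bruijn graph built from reads and extended from a seed. It is a generic textbook construction, not the authors' constrained assembly approach. The function names, the toy reads, and the choice of seed are assumptions made for illustration, and reverse complements and sequencing errors are ignored.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def de_bruijn(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers."""
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # prefix node -> suffix node
    return {node: sorted(successors) for node, successors in graph.items()}


def walk_unambiguous(graph: Dict[str, List[str]], seed: str) -> str:
    """Extend a seed node while the path is unambiguous.

    Stops at a branch (more than one successor), a dead end, or a cycle.
    """
    path = seed
    node = seed
    seen = {seed}
    while len(graph.get(node, [])) == 1:
        nxt = graph[node][0]
        if nxt in seen:
            break
        path += nxt[-1]  # each step adds one base
        seen.add(nxt)
        node = nxt
    return path


# Toy overlapping reads; the seed stands in for a known anchor sequence.
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
graph = de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
```

Starting the walk at a chosen seed loosely mirrors the idea of a targeted assembly that begins at a known site, but the stopping rules here are deliberately simplistic.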

CAMISIM: Simulating metagenomes and microbial communities

10.1101/300970 ◽

2018 ◽

Cited By ~ 4

Author(s):

Adrian Fritz ◽

Peter Hofmann ◽

Stephan Majda ◽

Eik Dahms ◽

Johannes Dröge ◽

...

Keyword(s):

Microbial Communities ◽

De Novo ◽

Real Data ◽

Small Data ◽

Data Sets ◽

Sequencing Data ◽

Taxonomic Profiling ◽

Benchmark Data ◽

Sequencing Technologies ◽

Wide Range

Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. Here, we describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series and differential abundance studies, includes real and simulated strain-level diversity, and generates second and third generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with truth standards for method evaluation. All data sets and the software are freely available at: https://github.com/CAMI-challenge/CAMISIM
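To make the notion of a simulated abundance profile concrete, here is a minimal sketch. It does not use CAMISIM or reproduce its code: the log-normal draw, the parameter values, the genome names, and both function names are assumptions chosen for illustration. It only shows how relative abundances might be sampled and turned into per-genome read counts.

```python
import random
from typing import Dict, Sequence


def simulate_abundances(
    genomes: Sequence[str], mu: float = 1.0, sigma: float = 2.0, seed: int = 0
) -> Dict[str, float]:
    """Draw one relative-abundance profile (values sum to 1).

    Assumption: abundances follow a log-normal distribution; mu and sigma
    are illustrative defaults, not values taken from the abstract.
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    rng = random.Random(seed)  # fixed seed keeps the profile reproducible
    raw = {g: rng.lognormvariate(mu, sigma) for g in genomes}
    total = sum(raw.values())
    return {g: value / total for g, value in raw.items()}


def reads_per_genome(
    abundances: Dict[str, float], genome_lengths: Dict[str, int], total_reads: int
) -> Dict[str, int]:
    """Split a read budget in proportion to abundance times genome length."""
    weights = {g: abundances[g] * genome_lengths[g] for g in abundances}
    total = sum(weights.values())
    return {g: round(total_reads * w / total) for g, w in weights.items()}


# Hypothetical three-genome community.
genomes = ["genome_a", "genome_b", "genome_c"]
lengths = {"genome_a": 4_000_000, "genome_b": 2_500_000, "genome_c": 5_200_000}
profile = simulate_abundances(genomes, seed=42)
allocation = reads_per_genome(profile, lengths, total_reads=1_000_000)
```

Weighting by genome length reflects the fact that, at equal cell abundance, a longer genome contributes more shotgun reads. Because each count is rounded independently, the allocation may differ from the read budget by a few reads.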

Elucidation of roles for vitamin B12 in regulation of folate, ubiquinone, and methionine metabolism

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1612360114 ◽

2017 ◽

Vol 114 (7) ◽

pp. E1205-E1214 ◽

Cited By ~ 37

Author(s):

Margaret F. Romine ◽

Dmitry A. Rodionov ◽

Yukari Maezato ◽

Lindsey N. Anderson ◽

Premchendar Nandhikonda ◽

...

Keyword(s):

Vitamin B12 ◽

Microbial Communities ◽

Binding Proteins ◽

De Novo ◽

Transcriptional Regulator ◽

Vitamin B ◽

Methionine Metabolism ◽

Community Metabolism ◽

Light Sensing ◽

Allosteric Effector

Only a small fraction of vitamin B12-requiring organisms are able to synthesize B12 de novo, making it a common commodity in microbial communities. Initially recognized as an enzyme cofactor of a few enzymes, recent studies have revealed additional B12-binding enzymes and regulatory roles for B12. Here we report the development and use of a B12-based chemical probe to identify B12-binding proteins in a nonphototrophic B12-producing bacterium. Two unexpected discoveries resulted from this study. First, we identified a light-sensing B12-binding transcriptional regulator and demonstrated that it controls folate and ubiquinone biosynthesis. Second, our probe captured proteins involved in folate, methionine, and ubiquinone metabolism, suggesting that it may play a role as an allosteric effector of these processes. These metabolic processes produce precursors for synthesis of DNA, RNA, and protein. Thereby, B12 likely modulates growth, and by limiting its availability to auxotrophs, B12-producing organisms may facilitate coordination of community metabolism.

Corynebacterium Comparative Genomics Reveals a Role for Cobamide Sharing in the Skin Microbiome

10.1101/2020.12.02.407395 ◽

2020 ◽

Author(s):

Mary Hannah Swaney ◽

Lindsay R Kalan

Keyword(s):

Comparative Genomics ◽

Microbial Communities ◽

Community Dynamics ◽

De Novo ◽

Skin Barrier ◽

Skin Microbiome ◽

Bacterial Phyla ◽

Host Interactions ◽

Complex Interactions ◽

Associated Species

ABSTRACT The human skin microbiome is a key player in human health, with diverse functions ranging from defense against pathogens to education of the immune system. Recent studies have begun unraveling the complex interactions within skin microbial communities, shedding light on the invaluable role that skin microorganisms have in maintaining a healthy skin barrier. While the Corynebacterium genus is a dominant taxon of the skin microbiome, relatively little is known about how skin-associated Corynebacteria contribute to microbe-microbe and microbe-host interactions on the skin. Here, we performed a comparative genomics analysis of 71 Corynebacterium species from diverse ecosystems, which revealed functional differences between host- and environment-associated species. In particular, host-associated species were enriched for de novo biosynthesis of cobamides, which are a class of cofactor essential for metabolism in organisms across the tree of life but are produced by a limited number of prokaryotes. Because cobamides have been hypothesized to mediate community dynamics within microbial communities, we analyzed skin metagenomes for Corynebacterium cobamide producers, which revealed a positive correlation between cobamide producer abundance and microbiome diversity, a trait associated with skin health. We also provide the first metagenome-based assessment of cobamide biosynthesis and utilization in the skin microbiome, showing that both dominant and low-abundance skin taxa encode the de novo biosynthesis pathway and that cobamide-dependent enzymes are encoded by phylogenetically diverse taxa across the major bacterial phyla on the skin. Taken together, our results support a role for cobamide sharing within skin microbial communities, which we hypothesize mediates community dynamics.

XSTREME: Comprehensive motif analysis of biological sequence datasets

10.1101/2021.09.02.458722 ◽

2021 ◽

Cited By ~ 1

Author(s):

Charles E. Grant ◽

Timothy L. Bailey

Keyword(s):

Motif Discovery ◽

De Novo ◽

Positional Distribution ◽

Enrichment Analysis ◽

Biological Sequence ◽

Motif Analysis ◽

Web Based ◽

Fully Integrated ◽

Commercial Use ◽

Motif Enrichment

Abstract XSTREME is a web-based tool for performing comprehensive motif discovery and analysis in DNA, RNA or protein sequences, as well as in sequences in user-defined alphabets. It is designed for both very large and very small datasets. XSTREME is similar to the MEME-ChIP tool, but expands upon its capabilities in several ways. Like MEME-ChIP, XSTREME performs two types of de novo motif discovery, and also performs motif enrichment analysis of the input sequences using databases of known motifs. Unlike MEME-ChIP, which ranks motifs based on their enrichment in the centers of the input sequences, XSTREME uses enrichment anywhere in the sequences for this purpose. Consequently, XSTREME is more appropriate for motif-based analysis of sequences regardless of how the motifs are distributed within the sequences. XSTREME uses the MEME and STREME algorithms for motif discovery, and the recently developed SEA algorithm for motif enrichment analysis. The interactive HTML output produced by XSTREME includes highly accurate motif significance estimates, plots of the positional distribution of each motif, and histograms of the number of motif matches in each sequence. XSTREME is easy to use via its web server at https://meme-suite.org, and is fully integrated with the widely-used MEME Suite of sequence analysis tools, which can be freely downloaded at the same web site for non-commercial use.

MetaCoAG: Binning Metagenomic Contigs via Composition, Coverage and Assembly Graphs

10.1101/2021.09.10.459728 ◽

2021 ◽

Author(s):

Vijini Mallawaarachchi ◽

Yu Lin

Keyword(s):

Microbial Communities ◽

De Novo ◽

State Of The Art ◽

Genetic Material ◽

Single Copy ◽

Experimental Results ◽

Marker Genes ◽

High Quality ◽

Second Best ◽

Direct Use

ABSTRACT Metagenomics binning has allowed us to study and characterize various genetic material of different species and gain insights into microbial communities. While existing binning tools bin metagenomics de novo assemblies, they do not make use of the assembly graphs that produce such assemblies. Here we propose MetaCoAG, a tool that utilizes assembly graphs with the composition and coverage information to bin metagenomic contigs. MetaCoAG uses single-copy marker genes to estimate the number of initial bins, assigns contigs into bins iteratively and adjusts the number of bins dynamically throughout the binning process. Experimental results on simulated and real datasets demonstrate that MetaCoAG significantly outperforms state-of-the-art binning tools, producing more high-quality bins than the second-best tool, with an average median F1-score of 88.40%. To the best of our knowledge, MetaCoAG is the first stand-alone binning tool to make direct use of the assembly graph information. MetaCoAG is available at https://github.com/Vini2/MetaCoAG.
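The F1-score quoted above combines precision and recall as their harmonic mean. The sketch below shows one simple way to score bins against a ground truth and take the median across bins. How the authors actually computed their per-bin scores is not stated in the abstract, so the contig-count definition used here, the function names, and the toy bins are all assumptions.

```python
from statistics import median
from typing import Dict, Set


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def bin_f1(assigned: Set[str], truth: Set[str]) -> float:
    """F1 of one bin, counting contigs (an assumption; bases could be used instead)."""
    if not assigned or not truth:
        return 0.0
    true_positives = len(assigned & truth)
    return f1_score(true_positives / len(assigned), true_positives / len(truth))


def median_f1(bins: Dict[str, Set[str]], genomes: Dict[str, Set[str]]) -> float:
    """Median F1 over bins, each compared with the genome of the same name."""
    return median(bin_f1(bins[name], genomes[name]) for name in bins)


# Hypothetical contig identifiers for two bins and their true genomes.
bins = {"bin1": {"c1", "c2", "c3", "c4"}, "bin2": {"c7", "c8"}}
genomes = {"bin1": {"c1", "c2", "c3", "c5", "c6"}, "bin2": {"c7", "c8"}}
score_bin1 = bin_f1(bins["bin1"], genomes["bin1"])
overall = median_f1(bins, genomes)
```

In the toy data, bin1 has precision 3/4 and recall 3/5, giving an F1 of 2/3, while bin2 is a perfect match.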

De novo metagenomic assembly reveals abundant novel major lineage of Archaea in hypersaline microbial communities

The ISME Journal ◽

10.1038/ismej.2011.78 ◽

2011 ◽

Vol 6 (1) ◽

pp. 81-93 ◽

Cited By ~ 228

Author(s):

Priya Narasingarao ◽

Sheila Podell ◽

Juan A Ugalde ◽

Céline Brochier-Armanet ◽

Joanne B Emerson ◽

...

Keyword(s):

Microbial Communities ◽

De Novo ◽

Major Lineage ◽

Metagenomic Assembly

1420 - NanoAmpli-Seq: A de novo protocol for amplicon sequencing from mixed microbial communities on the nanopore sequencing platform

10.26226/morressier.5b5199c3b1b87b000ecf0112 ◽

2018 ◽

Author(s):

Szymon Calus ◽

Umer Zeeshan Ijaz

Keyword(s):

Microbial Communities ◽

De Novo ◽

Amplicon Sequencing ◽

Nanopore Sequencing ◽

Sequencing Platform

Therapy-Related Myeloid Neoplasms Following Treatment for Multiple Myeloma : A Single-Center Analysis

Blood ◽

10.1182/blood-2019-122598 ◽

2019 ◽

Vol 134 (Supplement_1) ◽

pp. 4261-4261

Author(s):

Amelie Boquoi ◽

Soraya Magdalena Banahan ◽

Judith Strapatsas ◽

David Lopez y Niedenhoff ◽

Guido Kobbe ◽

...

Keyword(s):

Multiple Myeloma ◽

Bone Marrow ◽

Complete Remission ◽

Median Time ◽

Median Survival ◽

Partial Remission ◽

Research Funding ◽

De Novo ◽

Remission Status ◽

Support Research

Introduction Myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML) comprise late complications following mutagenic treatment. Limited data is available on the outcome of patients (pts) developing therapy-related MDS and AML (tMDS, tAML) after treatment for multiple myeloma (MM). Methods From 1976 to 2011, 3814 pts were entered into the Düsseldorf MDS registry. We identified 200 pts with tMDS or tAML. Of those, 41 pts had also been diagnosed with multiple myeloma (mm-MDS/AML). We compared these 41 pts to pts with de novo MDS (n=3614) and to pts with tMDS with other underlying diseases (n=159, 55 pts with other hematological diseases (34.5%), 93 with solid tumors (58.5%) and 11 with other diseases (7%)). Patient characteristics Median time between MM diagnosis and the onset of MDS was 5.5 years (range 0-28.5 years). Median age at the time of diagnosis of mm-MDS/AML was 67.8 years (range 32.5-84.6 years). Of all 41 mm-MDS pts, 13 developed AML (32%). Median time to progression from MDS to AML was 5 months (range 0.5-68 months). According to the WHO classification of 2016, there were 7 MDS-SLD, 10 MDS-MLD, 1 MDS-RS SLD, 13 MDS-RS MLD, 7 MDS-EB I, 2 EB-2, 1 MDS del(5q). 58% of mm-MDS pts had a complex karyotype, mostly affecting chromosomes 5 (22%) and 7 (17%), less often affected were chromosomes 17 (13%), 20 (13%) and 21 (13%). At MDS diagnosis, 11 MM pts were in complete remission (22%), 29 pts showed partial remission (58%), and 10 pts a stable disease (20%). 84.4% of pts with mm-MDS/AML had received conventional chemotherapy, mostly anthracyclines and alkylating agents. 94.4% had received melphalan. 15% of pts had received novel agents including immunomodulatory drugs and proteasome inhibitors. 
Results Both mm-MDS pts and tMDS pts were significantly younger than de novo MDS pts; however, there was no age difference between mm-MDS and tMDS (mm-MDS: mean 67.8 years, range 32-85, tMDS: mean 64.3 years, range 21-85, p<0.05, de novo MDS: 71.9 years, range 18-105; p<0.05). Both mm-MDS pts and de novo MDS pts showed significantly more males than females (mm-MDS 67% male versus 33% female, de novo MDS 57% versus 43%, p<0.05) while tMDS pts showed an equal ratio (48% versus 52%). When we compared risk group distribution according to IPSS-R we found significantly fewer mm-MDS pts to be in the lower risk categories (p<0.05 for both mm-MDS versus tMDS and mm-MDS versus de novo MDS). Both mm-MDS and tMDS pts had a significantly worse karyotype when compared to de novo MDS (p<0.05). More cell lineages were affected in mm-MDS and tMDS pts than in de novo MDS (p<0.05). 50% of mm-MDS pts were pancytopenic versus 26% of de novo pts (p<0.05). Hemoglobin levels were significantly lower in mm-MDS and tMDS pts than de novo MDS pts (p<0.05). mm-MDS pts showed significantly higher blast counts in the bone marrow than all tMDS (p<0.05). Progression to AML occurred significantly more often in mm-MDS pts. At 12 months we discovered 12% of de novo MDS pts to have transformed to AML, 19% of tMDS and 24% of mm-MDS. At 36 months, 20% of de novo MDS pts had transformed to AML, 34% of tMDS and 39% of mm-MDS (p<0.05). When mm-MDS pts transformed to AML their survival was very poor, however, not significantly different compared to mm-MDS without AML transformation (7 months versus 11 months, p>0.05). Median survival of de novo MDS pts was 32 months (CI 29.940 - 34.192, range 1-345 months). In contrast, median overall survival of both mm-MDS and all other tMDS was significantly shorter with 13 months in both groups (p<0.05, mm-MDS: CI 5.262 - 20.692, range 1-99 months; tMDS: CI 10.016 - 15.939, range 0-160 months).
Myeloma remission status had no impact on survival: pts in complete remission showed a median survival of 6 months (95% CI, range 0 - 35 months), pts with partial remission 7 months (95% CI, range 5 - 9 months) (p>0.05).
Conclusion Pts developing a myeloid neoplasm after treatment for multiple myeloma present with biological characteristics similar to those seen in pts with other tMDS. However, both clinical and molecular features are more severe with higher bone marrow blast counts, worse karyotypes, a more unfavourable IPSS-R score, and a significantly higher rate of transformation to AML. Yet despite a more aggressive phenotype, prognosis is equally poor and independent of myeloma remission status in mm-MDS/AML pts suggesting secondary myeloid neoplasia to govern the stem cell niche independent of previous disease or treatment.
Disclosures Boquoi: Celgene: Other: Travel, Accommodation, Expenses; Janssen: Other: Travel, Accommodations, Expenses; BMS: Honoraria; Amgen: Honoraria, Other: Travel, Accommodations, Expenses. Kobbe: Takeda: Honoraria, Other: Travel support; Novartis: Honoraria, Other: Travel support; Medac: Honoraria, Other: Travel support; Jazz: Honoraria, Other: Travel support; Roche: Honoraria, Other: Travel support; MSD: Honoraria, Other: Travel support; Neovii: Honoraria, Other: Travel support; Abbvie: Honoraria, Other: Travel support; Pfizer: Honoraria, Other: Travel support; Biotest: Honoraria, Other: Travel support; Celgene: Honoraria, Other: Travel support, Research Funding; Amgen: Honoraria, Other: Travel support, Research Funding. Gattermann: Takeda: Research Funding; Novartis: Honoraria; Alexion: Research Funding. Germing: Amgen: Honoraria; Celgene: Honoraria, Research Funding; Novartis: Honoraria, Research Funding; Jazz Pharmaceuticals: Honoraria. Schroeder: Celgene Corporation: Consultancy, Honoraria, Research Funding. Fenk: Takeda: Honoraria; Janssen: Honoraria; BMS: Honoraria, Other: Travel, Accommodation, Expenses; Amgen: Honoraria; Celgene: Honoraria, Research Funding.
