General Unified Microbiome Profiling Pipeline (GUMPP) for Large Scale, Streamlined and Reproducible Analysis of Bacterial 16S rRNA Data to Predicted Microbial Metagenomes, Enzymatic Reactions and Metabolic Pathways

General Unified Microbiome Profiling Pipeline (GUMPP) was developed for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data and prediction of microbial metagenomes, enzymatic reactions and metabolic pathways from amplicon data. The GUMPP workflow introduces reproducible data analyses at each of three levels of resolution (genus; operational taxonomic units (OTUs); amplicon sequence variants (ASVs)). The ability to support reproducible analyses enables production of datasets that ultimately identify the biochemical pathways characteristic of disease pathology. These datasets, coupled to biostatistics and mathematical approaches of machine learning, can play a significant role in extraction of truly significant and meaningful information from a wide set of 16S rRNA datasets. The adoption of GUMPP in gut microbiota-related research enables focusing on the generation of novel biomarkers that can lead to the development of mechanistic hypotheses applicable to the development of novel therapies in personalized medicine.

Impact of Helicobacter Pylori Infection on Duodenal Microbial Community Structure and Microbial Metabolic Pathways

10.21203/rs.3.rs-166718/v2 ◽

2021 ◽

Author(s):

Tadashi Maeda ◽

Hiroaki Zai ◽

Yuto Fukui ◽

Yoshifumi Kato ◽

Eri Kumade ◽

...

Keyword(s):

Helicobacter Pylori ◽

16S Rrna ◽

Metabolic Pathways ◽

Amplicon Sequencing ◽

16S Rrna Genes ◽

Rrna Genes ◽

Rrna Gene ◽

Operational Taxonomic Units ◽

H Pylori ◽

The Impact

Abstract Background The bioactivities of commensal duodenal microbiota greatly influence the biofunction of hosts. We investigated the role of Helicobacter pylori infection in extra-gastroduodenal diseases by determining the impact of H. pylori infection on the duodenal microbiota. We sequenced 16S rRNA genes in samples aspirated from the descending duodenum of 47 (male, 20; female, 27) individuals who were screened for gastric cancer. Samples were analysed using 16S rRNA gene amplicon sequencing, and the LEfSe and Kyoto Encyclopaedia of Genes and Genomes methods were used to determine whether the duodenal microflora and microbial biofunctions were affected by H. pylori infection. Results Thirteen and 34 participants tested positive and negative for H. pylori, respectively. We identified 1,404 bacterial operational taxonomic units from 23 phyla and 253 genera. H. pylori infection increased the relative mean abundance of Proteobacteria and Neisseria and decreased the abundance of two other phyla (Actinobacteria and TM7) and nine genera (Rothia, TM7-3, Leptotrichia, Lachnospiraceae, Megasphaera, F16, Moryella, Filifactor, and Paludibacter). Microbiota features were significantly influenced in H. pylori-positive participants by 12 taxa mostly classified as Gammaproteobacteria. Microbial functional annotation revealed that H. pylori significantly affected 12 microbial metabolic pathways. Conclusions H. pylori disrupted normal bacterial communities in the duodenum and changed the biofunctions of commensal microbiota primarily by upregulating specific metabolic pathways. Such upregulation may be involved in the onset of diseases associated with H. pylori infection.

Enzyme annotation for orphan and novel reactions using knowledge of substrate reactive sites

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1818877116 ◽

2019 ◽

Vol 116 (15) ◽

pp. 7298-7307 ◽

Cited By ~ 17

Author(s):

Noushin Hadadi ◽

Homa MohammadiPeyhani ◽

Ljubisa Miskovic ◽

Marianne Seijo ◽

Vassily Hatzimanikatis

Keyword(s):

Metabolic Pathways ◽

Large Scale ◽

Scale Validation ◽

Enzymatic Reaction ◽

Binding Pocket ◽

Enzymatic Reactions ◽

Accurate Identification ◽

Enzyme Binding ◽

Associated Sequences ◽

Related Enzyme

Thousands of biochemical reactions with characterized activities are “orphan,” meaning they cannot be assigned to a specific enzyme, leaving gaps in metabolic pathways. Novel reactions predicted by pathway-generation tools also lack associated sequences, limiting protein engineering applications. Associating orphan and novel reactions with known biochemistry and suggesting enzymes to catalyze them is a daunting problem. We propose the method BridgIT to identify candidate genes and catalyzing proteins for these reactions. This method introduces information about the enzyme binding pocket into reaction-similarity comparisons. BridgIT assesses the similarity of two reactions, one orphan and one well-characterized nonorphan reaction, using their substrate reactive sites, their surrounding structures, and the structures of the generated products to suggest enzymes that catalyze the most-similar nonorphan reactions as candidates for also catalyzing the orphan ones. We performed two large-scale validation studies to test BridgIT predictions against experimental biochemical evidence. For the 234 orphan reactions from the Kyoto Encyclopedia of Genes and Genomes (KEGG) 2011 (a comprehensive enzymatic-reaction database) that became nonorphan in KEGG 2018, BridgIT predicted the exact or a highly related enzyme for 211 of them. Moreover, for 334 of 379 novel reactions in 2014 that were later cataloged in KEGG 2018, BridgIT predicted the exact or highly similar enzymes. BridgIT requires knowledge about only four connecting bonds around the atoms of the reactive sites to correctly annotate proteins for 93% of analyzed enzymatic reactions. Increasing to seven connecting bonds allowed for the accurate identification of a sequence for nearly all known enzymatic reactions.
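The two validation results quoted above can be turned into success rates with simple arithmetic. The following is a minimal sketch using only the counts given in the abstract; the names `success_rate`, `kegg_orphans`, and `novel_reactions` are illustrative and are not part of BridgIT.

```python
# Validation counts quoted in the abstract above.
kegg_orphans = (211, 234)     # (reactions matched, reactions tested)
novel_reactions = (334, 379)  # (reactions matched, reactions tested)


def success_rate(matched, tested):
    """Return the percentage of tested reactions with a matching prediction."""
    return 100.0 * matched / tested


orphan_rate = success_rate(*kegg_orphans)
novel_rate = success_rate(*novel_reactions)

print(f"orphan reactions: {orphan_rate:.1f}%")  # about 90.2%
print(f"novel reactions:  {novel_rate:.1f}%")   # about 88.1%
```

On these counts, both validation sets land at roughly nine in ten reactions receiving the exact or a highly related enzyme.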

Metabolic pathways inferred from a bacterial marker gene illuminate ecological changes across South Pacific frontal boundaries

Nature Communications ◽

10.1038/s41467-021-22409-4 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Eric J. Raes ◽

Kristen Karsh ◽

Swan L. S. Sow ◽

Martin Ostrowski ◽

Mark V. Brown ◽

...

Keyword(s):

16S Rrna ◽

Metabolic Pathways ◽

Low Cost ◽

Marker Gene ◽

South Pacific ◽

Rrna Gene ◽

South Pacific Ocean ◽

Bacterial Marker ◽

Gene 16S Rrna ◽

Gene Data

Abstract Global oceanographic monitoring initiatives originally measured abiotic essential ocean variables but are currently incorporating biological and metagenomic sampling programs. There is, however, a large knowledge gap on how to infer bacterial functions, the information sought by biogeochemists, ecologists, and modelers, from the bacterial taxonomic information (produced by bacterial marker gene surveys). Here, we provide a correlative understanding of how a bacterial marker gene (16S rRNA) can be used to infer latitudinal trends for metabolic pathways in global monitoring campaigns. From a transect spanning 7000 km in the South Pacific Ocean we infer ten metabolic pathways from 16S rRNA gene sequences and 11 corresponding metagenome samples, which relate to metabolic processes of primary productivity, temperature-regulated thermodynamic effects, coping strategies for nutrient limitation, energy metabolism, and organic matter degradation. This study demonstrates that low-cost, high-throughput bacterial marker gene data can be used to infer shifts in the metabolic strategies at the community scale.

Meta-Apo improves accuracy of 16S-amplicon-based prediction of microbiome function

BMC Genomics ◽

10.1186/s12864-020-07307-1 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Gongchao Jing ◽

Yufeng Zhang ◽

Wenzhi Cui ◽

Lu Liu ◽

Jian Xu ◽

...

Keyword(s):

16S Rrna ◽

Large Scale ◽

Low Cost ◽

Human Microbiome ◽

Amplicon Sequencing ◽

Training Sample ◽

Rrna Gene ◽

16S Amplicon Sequencing ◽

Cross Platform ◽

Functional Profiles

Abstract Background Due to their much lower costs in experiment and computation than metagenomic whole-genome sequencing (WGS), 16S rRNA gene amplicons have been widely used for predicting the functional profiles of microbiomes, via software tools such as PICRUSt 2. However, due to the potential PCR bias and gene profile variation among phylogenetically related genomes, functional profiles predicted from 16S amplicons may deviate from WGS-derived ones, resulting in misleading results. Results Here we present Meta-Apo, which greatly reduces or even eliminates such deviation, thus yielding much more consistent diversity patterns between the two approaches. Tests of Meta-Apo on > 5000 16S-rRNA amplicon human microbiome samples from 4 body sites showed that the deviation between the two strategies is significantly reduced by using only 15 WGS-amplicon training sample pairs. Moreover, Meta-Apo enables cross-platform functional comparison between WGS and amplicon samples, thus greatly improving 16S-based microbiome diagnosis, e.g. accuracy of gingivitis diagnosis via 16S-derived functional profiles was elevated from 65 to 95% by WGS-based classification. Therefore, with the low cost of 16S-amplicon sequencing, Meta-Apo can produce a reliable, high-resolution view of microbiome function equivalent to that offered by shotgun WGS. Conclusions This suggests that large-scale, function-oriented microbiome sequencing projects can probably benefit from the lower cost of the 16S-amplicon strategy, without sacrificing the precision in functional reconstruction that otherwise requires WGS. An optimized C++ implementation of Meta-Apo is available on GitHub (https://github.com/qibebt-bioinfo/meta-apo) under a GNU GPL license. It takes the functional profiles of a few paired WGS:16S-amplicon samples as training, and outputs the calibrated functional profiles for the much larger number of 16S-amplicon samples.

Comprehensive Plasma Metabolomic Profile of Patients with Advanced Neuroendocrine Tumors (NETs). Diagnostic and Biological Relevance

Cancers ◽

10.3390/cancers13112634 ◽

2021 ◽

Vol 13 (11) ◽

pp. 2634

Author(s):

Beatriz Soldevilla ◽

Angeles López-López ◽

Alberto Lens-Pardo ◽

Carlos Carretero-Puche ◽

Angeles Lopez-Gonzalvez ◽

...

Keyword(s):

Metabolic Pathways ◽

Metabolic Networks ◽

Tca Cycle ◽

Metabolomic Profile ◽

Diagnostic Potential ◽

The Tca Cycle ◽

Novel Biomarkers ◽

Potential Methods ◽

Clinical Potential ◽

First Time

Purpose: High-throughput “-omic” technologies have enabled the detailed analysis of metabolic networks in several cancers, but NETs have not been explored to date. We aim to assess the metabolomic profile of NET patients to understand metabolic deregulation in these tumors and identify novel biomarkers with clinical potential. Methods: Plasma samples from 77 NETs and 68 controls were profiled by GC−MS, CE−MS and LC−MS untargeted metabolomics. OPLS-DA was performed to evaluate metabolomic differences. Related pathways were explored using MetaboAnalyst 4.0. Finally, ROC and OPLS-DA analyses were performed to select metabolites with biomarker potential. Results: We identified 155 differential compounds between NETs and controls. We detected an increase of bile acids, sugars, oxidized lipids and oxidized products from arachidonic acid and a decrease of carnitine levels in NETs. MPA/MSEA identified 32 enriched metabolic pathways in NETs related to the TCA cycle and amino acid metabolism. Finally, OPLS-DA and ROC analysis revealed 48 metabolites with diagnostic potential. Conclusions: This study provides, for the first time, a comprehensive metabolic profile of NET patients and identifies a distinctive metabolic signature in plasma of potential clinical use. A reduced set of metabolites of high diagnostic accuracy has been identified. Additionally, the newly annotated enriched metabolic pathways may open innovative avenues of clinical research.

Handling of spurious sequences affects the outcome of high-throughput 16S rRNA gene amplicon profiling

ISME Communications ◽

10.1038/s43705-021-00033-z ◽

2021 ◽

Vol 1 (1) ◽

Author(s):

Sandra Reitmeier ◽

Thomas C. A. Hitch ◽

Nicole Treichel ◽

Nikolaos Fikas ◽

Bela Hausmann ◽

...

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Amplicon Sequencing ◽

Rrna Gene ◽

Diversity Analysis ◽

Careful Attention ◽

Operational Taxonomic Units ◽

Basic Concepts ◽

Gnotobiotic Mice ◽

Mock Communities

Abstract 16S rRNA gene amplicon sequencing is a popular approach for studying microbiomes. However, some basic concepts have still not been investigated comprehensively. We studied the occurrence of spurious sequences using defined microbial communities based on data either from the literature or generated in three sequencing facilities and analyzed via both operational taxonomic units (OTUs) and amplicon sequence variants (ASVs) approaches. OTU clustering and singleton removal, a commonly used approach, delivered approximately 50% (mock communities) to 80% (gnotobiotic mice) spurious taxa. The fraction of spurious taxa was generally lower based on ASV analysis, but varied depending on the gene region targeted and the barcoding system used. A relative abundance of 0.25% was found as an effective threshold below which the analysis of spurious taxa can be prevented to a large extent in both OTU- and ASV-based analysis approaches. Using this cutoff improved the reproducibility of analysis, i.e., variation in richness estimates was reduced by 38% compared with singleton filtering using six human fecal samples across seven sequencing runs. Beta-diversity analysis of human fecal communities was markedly affected by both the filtering strategy and the type of phylogenetic distances used for comparison, highlighting the importance of carefully analyzing data before drawing conclusions on microbiome changes. In summary, handling of artifact sequences during bioinformatic processing of 16S rRNA gene amplicon data requires careful attention to avoid the generation of misleading findings. We propose the concept of effective richness to facilitate the comparison of alpha-diversity across studies.
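The 0.25% relative-abundance cutoff described above can be illustrated with a short filter. This is a minimal sketch, not the authors' implementation: it assumes the cutoff is applied per sample to a table of read counts, and the function names, the taxon labels, and the counts are all hypothetical. Counting the taxa that survive the cutoff is one plausible reading of a richness estimate after filtering, not necessarily the paper's definition of effective richness.

```python
def filter_low_abundance(counts, threshold=0.0025):
    """Keep only taxa whose relative abundance in this sample is >= threshold.

    counts: dict mapping taxon label -> read count for one sample.
    threshold: relative-abundance cutoff (0.0025 corresponds to 0.25%).
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n for taxon, n in counts.items() if n / total >= threshold}


def richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical sample with 10,000 reads in total.
sample = {
    "taxon_a": 6000,  # 60.00%
    "taxon_b": 3949,  # 39.49%
    "taxon_c": 30,    # 0.30%, above the cutoff
    "taxon_d": 20,    # 0.20%, below the cutoff
    "taxon_e": 1,     # singleton, below the cutoff
}

kept = filter_low_abundance(sample)
print(sorted(kept))                      # ['taxon_a', 'taxon_b', 'taxon_c']
print(richness(sample), richness(kept))  # 5 3
```

Note that singleton removal alone would drop only `taxon_e` here, whereas the relative-abundance cutoff also drops `taxon_d`, which is the kind of difference between filtering strategies that the abstract discusses.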

Variations in 16S rRNA-based microbiome profiling between pyrosequencing runs and between pyrosequencing facilities

The Journal of Microbiology ◽

10.1007/s12275-014-3443-3 ◽

2014 ◽

Vol 52 (5) ◽

pp. 355-365 ◽

Cited By ~ 20

Author(s):

Minseok Kim ◽

Zhongtang Yu

Keyword(s):

16S Rrna ◽

Microbiome Profiling

Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities

Applied and Environmental Microbiology ◽

10.1128/aem.01541-09 ◽

2009 ◽

Vol 75 (23) ◽

pp. 7537-7541 ◽

Cited By ~ 11597

Author(s):

Patrick D. Schloss ◽

Sarah L. Westcott ◽

Thomas Ryabin ◽

Justine R. Hall ◽

Martin Hartmann ◽

...

Keyword(s):

16S Rrna ◽

Software Package ◽

Sequence Data ◽

Rrna Gene ◽

Sequencing Data ◽

Laptop Computer ◽

Operational Taxonomic Units ◽

Β Diversity ◽

Single Piece

ABSTRACT mothur aims to be a comprehensive software package that allows users to use a single piece of software to analyze community sequence data. It builds upon previous tools to provide a flexible and powerful software package for analyzing sequencing data. As a case study, we used mothur to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments. This analysis of more than 222,000 sequences was completed in less than 2 h with a laptop computer.

Phoenix 2: A locally installable large-scale 16S rRNA gene sequence analysis pipeline with Web interface

Journal of Biotechnology ◽

10.1016/j.jbiotec.2013.07.004 ◽

2013 ◽

Vol 167 (4) ◽

pp. 393-403 ◽

Cited By ~ 44

Author(s):

Jung Soh ◽

Xiaoli Dong ◽

Sean M. Caffrey ◽

Gerrit Voordouw ◽

Christoph W. Sensen

Keyword(s):

Sequence Analysis ◽

16S Rrna ◽

16S Rrna Gene ◽

Large Scale ◽

Gene Sequence ◽

Rrna Gene ◽

Web Interface ◽

Rrna Gene Sequence ◽

Analysis Pipeline ◽

Gene Sequence Analysis

The challenges and potential utility of phenotypic specimen-level phylogeny based on maximum parsimony

Earth and Environmental Science Transactions of the Royal Society of Edinburgh ◽

10.1017/s1755691018000877 ◽

2018 ◽

Vol 109 (1-2) ◽

pp. 301-323 ◽

Cited By ~ 2

Author(s):

Emanuel TSCHOPP ◽

Paul UPCHURCH

Keyword(s):

Phylogenetic Analysis ◽

Maximum Parsimony ◽

Large Scale ◽

Intraspecific Variability ◽

Species Level ◽

A Posteriori ◽

Operational Taxonomic Units ◽

Vertebrate Palaeontology ◽

Tree Topologies ◽

Parsimony Criterion

ABSTRACT Specimen-level phylogenetic approaches are widely used in molecular biology for taxonomic and systematic purposes. However, they have been largely ignored in analyses based on morphological traits, where phylogeneticists mostly resort to species-level analyses. Recently, a number of specimen-level studies have been published in vertebrate palaeontology. These studies indicate that specimen-level phylogeny may be a very useful tool for systematic reassessments at low taxonomic levels. Herein, we review the challenges when working with individual organisms as operational taxonomic units in a palaeontological context, and propose guidelines of how best to perform a specimen-level phylogenetic analysis using the maximum parsimony criterion. Given that no single methodology appears to be perfectly suited to resolve relationships among individuals, and that different taxa probably require different approaches to assess their systematics, we advocate the use of a number of methodologies. In particular, we recommend the inclusion of as many specimens and characters as feasible, and the analysis of relationships using an extended implied weighting approach with different downweighting functions. Resulting polytomies should be explored using a posteriori pruning of unstable specimens, and conflicting tree topologies between different iterations of the analysis should be evaluated by a combination of support values such as jackknifing and symmetric resampling. Species delimitation should be consistent among the ingroup and based on a reproducible approach. Although time-consuming and methodologically challenging, specimen-level phylogenetic analysis is a highly useful tool to assess intraspecific variability and provide the basis for a more informed and accurate creation of species-level operational taxonomic units in large-scale systematic studies. It also has the potential to inform us about past speciation processes, morphological trait evolution, and their potential intrinsic and extrinsic drivers in pre-eminent detail.
