Adapting macroecology to microbiology: using occupancy modelling to assess functional profiles across metagenomes

2021 ◽  
Author(s):  
Angus S Hilts ◽  
Manjot S Hunjan ◽  
Laura A. Hug

Metagenomic sequencing provides information on the metabolic capacities and taxonomic affiliations for members of a microbial community. When assessing metabolic functions in a community, missing genes in pathways can occur in two ways: the genes may legitimately be missing from the community whose DNA was sequenced, or the genes were missed during shotgun sequencing or failed to assemble, and thus the metabolic capacity of interest is wrongly absent from the sequence data. Here, we borrow and adapt occupancy modelling from macroecology to provide mathematical context to metabolic predictions from metagenomes. We review the five assumptions underlying occupancy modelling through the lens of microbial community sequence data. Using the methane cycle, we apply occupancy modelling to examine the presence and absence of methanogenesis and methanotrophy genes from nearly 10,000 metagenomes spanning global environments. We determine that methanogenesis and methanotrophy are positively correlated across environments, and note that the lack of available standardized metadata for most metagenomes is a significant hindrance to large-scale statistical analyses. We present this adaptation of macroecology’s occupancy modelling to metagenomics as a tool for assessing presence/absence of traits in environmental microbiological surveys. We further initiate a call for stronger metadata standards to accompany metagenome deposition, to enable robust statistical approaches in the future.
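The distinction drawn above, between a gene that is truly absent and a gene that was present but missed, is what a basic single-season occupancy model formalizes. The following is a minimal sketch of that textbook likelihood, not the authors' implementation; the abstract does not state which model variant or software was used. The function names, the idea of treating repeat metagenomes from one environment as "surveys", and the parameter names `psi` (probability the trait is present) and `p` (probability of detecting it in one survey, given presence) are all assumptions made for illustration.

```python
import math


def site_likelihood(detections, psi, p):
    """Likelihood of one site's detection history under a basic occupancy model.

    detections: list of 0/1 values, one per repeat survey (hypothetical:
                one per metagenome sampled from the same environment).
    psi: probability that the trait is truly present at the site.
    p:   probability of detecting the trait in one survey, given presence.
    """
    k = len(detections)
    d = sum(detections)
    # Trait present, detected d times and missed k - d times.
    occupied = psi * (p ** d) * ((1 - p) ** (k - d))
    # Only an all-zero history is compatible with true absence.
    unoccupied = (1 - psi) if d == 0 else 0.0
    return occupied + unoccupied


def log_likelihood(histories, psi, p):
    """Sum of per-site log-likelihoods; maximize over psi and p to fit."""
    return sum(math.log(site_likelihood(h, psi, p)) for h in histories)


def prob_present_given_no_detection(k, psi, p):
    """Probability the trait is present even though k surveys all missed it."""
    missed = psi * (1 - p) ** k
    return missed / (missed + (1 - psi))
```

Under these assumptions, `prob_present_given_no_detection` is the quantity of practical interest: it expresses how much weight an apparent absence in the sequence data should carry, given the estimated detection probability.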

2018 ◽  
Author(s):  
Lucas Czech ◽  
Alexandros Stamatakis

Abstract Motivation: In most metagenomic sequencing studies, the initial analysis step consists in assessing the evolutionary provenance of the sequences. Phylogenetic (or Evolutionary) Placement methods can be employed to determine the evolutionary position of sequences with respect to a given reference phylogeny. These placement methods do however face certain limitations: The manual selection of reference sequences is labor-intensive; the computational effort to infer reference phylogenies is substantially larger than for methods that rely on sequence similarity; the number of taxa in the reference phylogeny should be small enough to allow for visually inspecting the results. Results: We present algorithms to overcome the above limitations. First, we introduce a method to automatically construct representative sequences from databases to infer reference phylogenies. Second, we present an approach for conducting large-scale phylogenetic placements on nested phylogenies. Third, we describe a preprocessing pipeline that allows for handling huge sequence data sets. Our experiments on empirical data show that our methods substantially accelerate the workflow and yield highly accurate placement results. Implementation: Freely available under GPLv3 at http://github.com/lczech/[email protected] Supplementary Information: Supplementary data are available at Bioinformatics online.


Author(s):  
Martin Steinegger ◽  
Steven L Salzberg

Metagenomic sequencing allows researchers to investigate organisms sampled from their native environments by sequencing their DNA directly, and then quantifying the abundance and taxonomic composition of the organisms thus captured. However, these types of analyses are sensitive to contamination in public databases caused by incorrectly labeled reference sequences. Here we describe Conterminator, an efficient method to detect and remove incorrectly labelled sequences by an exhaustive all-against-all sequence comparison. Our analysis reports contamination in 114,035 sequences and 2767 species in the NCBI Reference Sequence Database (RefSeq), 2,161,746 sequences and 6795 species in the GenBank database, and 14,132 protein sequences in the NR non-redundant protein database. Conterminator uncovers contamination in sequences spanning the whole range from draft genomes to “complete” model organism genomes. Our method, which scales linearly with input size, was able to process 3.3 terabytes of genomic sequence data in 12 days on a single 32-core compute node. We believe that Conterminator can become an important tool to ensure the quality of reference databases with particular importance for downstream metagenomic analyses. Source code (GPLv3): https://github.com/martin-steinegger/conterminator


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Alexander Eng ◽  
Adrian J. Verster ◽  
Elhanan Borenstein

Abstract Background Microbial communities have become an important subject of research across multiple disciplines in recent years. These communities are often examined via shotgun metagenomic sequencing, a technology which can offer unique insights into the genomic content of a microbial community. Functional annotation of shotgun metagenomic data has become an increasingly popular method for identifying the aggregate functional capacities encoded by the community’s constituent microbes. Currently available metagenomic functional annotation pipelines, however, suffer from several shortcomings, including limited pipeline customization options, lack of standard raw sequence data pre-processing, and insufficient capabilities for integration with distributed computing systems. Results Here we introduce MetaLAFFA, a functional annotation pipeline designed to take unfiltered shotgun metagenomic data as input and generate functional profiles. MetaLAFFA is implemented as a Snakemake pipeline, which enables convenient integration with distributed computing clusters, allowing users to take full advantage of available computing resources. Default pipeline settings allow new users to run MetaLAFFA according to common practices while a Python module-based configuration system provides advanced users with a flexible interface for pipeline customization. MetaLAFFA also generates summary statistics for each step in the pipeline so that users can better understand pre-processing and annotation quality. Conclusions MetaLAFFA is a new end-to-end metagenomic functional annotation pipeline with distributed computing compatibility and flexible customization options. MetaLAFFA source code is available at https://github.com/borenstein-lab/MetaLAFFA and can be installed via Conda as described in the accompanying documentation.


2019 ◽  
Vol 85 (15) ◽  
Author(s):  
Renxing Liang ◽  
Maggie Lau ◽  
Tatiana Vishnivetskaya ◽  
Karen G. Lloyd ◽  
Wei Wang ◽  
...  

ABSTRACT The prevalence of microbial life in permafrost up to several million years (Ma) old has been well documented. However, the long-term survivability, evolution, and metabolic activity of the entombed microbes over this time span remain underexplored. We integrated aspartic acid (Asp) racemization assays with metagenomic sequencing to characterize the microbial activity, phylogenetic diversity, and metabolic functions of indigenous microbial communities across a ∼0.01- to 1.1-Ma chronosequence of continuously frozen permafrost from northeastern Siberia. Although Asp in the older bulk sediments (0.8 to 1.1 Ma) underwent severe racemization relative to that in the youngest sediment (∼0.01 Ma), the much lower d-Asp/l-Asp ratio (0.05 to 0.14) in the separated cells from all samples suggested that indigenous microbial communities were viable and metabolically active in ancient permafrost up to 1.1 Ma. The microbial community in the youngest sediment was the most diverse and was dominated by the phyla Actinobacteria and Proteobacteria. In contrast, microbial diversity decreased dramatically in the older sediments, and anaerobic, spore-forming bacteria within Firmicutes became overwhelmingly dominant. In addition to the enrichment of sporulation-related genes, functional genes involved in anaerobic metabolic pathways such as fermentation, sulfate reduction, and methanogenesis were more abundant in the older sediments. Taken together, the predominance of spore-forming bacteria and associated anaerobic metabolism in the older sediments suggest that a subset of the original indigenous microbial community entrapped in the permafrost survived burial over geological time.
IMPORTANCE Understanding the long-term survivability and associated metabolic traits of microorganisms in ancient permafrost frozen millions of years ago provides a unique window into the burial and preservation processes experienced in general by subsurface microorganisms in sedimentary deposits because of permafrost’s hydrological isolation and exceptional DNA preservation. We employed aspartic acid racemization modeling and metagenomics to determine which microbial communities were metabolically active in the 1.1-Ma permafrost from northeastern Siberia. The simultaneous sequencing of extracellular and intracellular genomic DNA provided insight into the metabolic potential distinguishing extinct from extant microorganisms under frozen conditions over this time interval. This in-depth metagenomic sequencing advances our understanding of the microbial diversity and metabolic functions of extant microbiomes from early Pleistocene permafrost. Therefore, these findings extend our knowledge of the survivability of microbes in permafrost from 33,000 years to 1.1 Ma.


2021 ◽  
Author(s):  
Cong Jiang ◽  
Wei Shui ◽  
Su-Feng Zhu ◽  
Jie Feng

Abstract Background: Karst tiankeng is a large-scale negative surface terrain, and slope aspect affects the soil conditions, vegetation and microbial flora in the tiankeng. However, the influence of slope aspect on the soil microbial community in tiankeng has not been elucidated. Methods: In this study, metagenomic sequencing technology was used to analyze the soil microbial communities and metabolic function on the shady and sunny slopes of karst tiankeng. Results: The Shannon-Wiener diversity of microbial communities on shady slopes was significantly higher than that on sunny slopes. Shady and sunny slopes have similar microbial community composition, but there are differences in abundance. The linear discriminant analysis (LDA) results showed that biomarkers mainly belong to Actinobacteria, Chloroflexi and Proteobacteria. Functional pathways and CAZy (Carbohydrate-Active Enzymes) genes also showed a remarkable response to slope aspect change. LEfSe results indicated that several biomarker pathways on the sunny slope were involved in human disease. Moreover, the abundance of CAZy genes was higher on the shady slope, indicating a stronger ability to decompose litter. The microbial communities were mainly correlated with vegetation characteristics (species richness and coverage) and soil properties (SOM and pH). Conclusions: These results indicate that slope aspect has a pronounced influence on microbial community composition, structure and function in karst tiankeng. In the future, the conservation of karst tiankeng biodiversity should pay more attention to topographical factors.
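The Shannon-Wiener diversity reported above has a standard definition, H = -Σ p_i ln p_i, where p_i is the relative abundance of taxon i. A minimal sketch of that formula follows; the abstract does not say how the index was computed (which taxonomic level, which logarithm base, or which software), so the function name, the use of raw counts as input, and the natural logarithm are assumptions.

```python
import math


def shannon_wiener(counts):
    """Shannon-Wiener index H = -sum(p_i * ln(p_i)) from per-taxon counts.

    counts: non-negative abundances, one per taxon (hypothetical input;
            e.g. read counts assigned to each taxon in one sample).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:  # taxa with zero abundance contribute nothing
            p = c / total
            h -= p * math.log(p)
    return h
```

For a perfectly even community of four taxa, `shannon_wiener([10, 10, 10, 10])` equals ln 4, and a community with a single taxon gives 0, so higher values indicate a richer, more even community.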


2021 ◽  
Vol 12 (2) ◽  
pp. 1-22
Author(s):  
Jianguo Chen ◽  
Kenli Li ◽  
Keqin Li ◽  
Philip S. Yu ◽  
Zeng Zeng

Benefiting from convenient cycling and flexible parking locations, the Dockless Public Bicycle-sharing (DL-PBS) network has become increasingly popular in many countries. However, redundant and low-utility stations waste public urban space and raise maintenance costs for DL-PBS vendors. In this article, we propose a Bicycle Station Dynamic Planning (BSDP) system to dynamically provide the optimal bicycle station layout for the DL-PBS network. The BSDP system contains four modules: bicycle drop-off location clustering, bicycle-station graph modeling, bicycle-station location prediction, and bicycle-station layout recommendation. In the bicycle drop-off location clustering module, candidate bicycle stations are clustered from each spatio-temporal subset of the large-scale cycling trajectory records. In the bicycle-station graph modeling module, a weighted digraph model is built based on the clustering results, and inferior stations with low station revenue and utility are filtered out. Then, graph models across time periods are combined to create a graph sequence model. In the bicycle-station location prediction module, the GGNN model is trained on the graph sequence data and dynamically predicts bicycle stations in the next period. In the bicycle-station layout recommendation module, the predicted bicycle stations are fine-tuned according to the government urban management plan, which ensures that the recommended station layout is conducive to city management, vendor revenue, and user convenience. Experiments on actual DL-PBS networks verify the effectiveness, accuracy, and feasibility of the proposed BSDP system.


2021 ◽  
Vol 7 (2) ◽  
pp. 105
Author(s):  
Vinodhini Thiyagaraja ◽  
Robert Lücking ◽  
Damien Ertz ◽  
Samantha C. Karunarathna ◽  
Dhanushka N. Wanasinghe ◽  
...  

Ostropales sensu lato is a large group comprising both lichenized and non-lichenized fungi, with several lineages expressing optional lichenization where individuals of the same fungal species exhibit either saprotrophic or lichenized lifestyles depending on the substrate (bark or wood). Greatly variable phenotypic characteristics and large-scale phylogenies have led to frequent changes in the taxonomic circumscription of this order. Ostropales sensu lato is currently split into Graphidales, Gyalectales, Odontotrematales, Ostropales sensu stricto, and Thelenellales. Ostropales sensu stricto is now confined to the family Stictidaceae, which includes a large number of species that are poorly known, since they usually have small fruiting bodies that are rarely collected, and thus, their taxonomy remains partly unresolved. Here, we introduce a new genus Ostropomyces to accommodate a novel lineage related to Ostropa, which is composed of two new species, as well as a new species of Sphaeropezia, S. shangrilaensis. Maximum likelihood and Bayesian inference analyses of mitochondrial small subunit spacers (mtSSU), large subunit nuclear rDNA (LSU), and internal transcribed spacers (ITS) sequence data, together with phenotypic data documented by detailed morphological and anatomical analyses, support the taxonomic affinity of the new taxa in Stictidaceae. Ancestral character state analysis did not resolve the ancestral nutritional status of Stictidaceae with confidence using Bayes traits, but a saprotrophic ancestor was indicated as most likely in a Bayesian binary Markov Chain Monte Carlo sampling (MCMC) approach. Frequent switching in nutritional modes between lineages suggests that lifestyle transition played an important role in the evolution of this family.


AMB Express ◽  
2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Zhiyong Liu ◽  
Kai Dang ◽  
Cunzhi Li ◽  
Junhong Gao ◽  
Hong Wang ◽  
...  

Abstract Hexanitrohexaazaisowurtzitane (CL-20) is a compound with a polycyclic cage and an N-nitro group that has been shown to play an unfavorable role in environmental fate, biosafety, and physical health. The aim of this study was to isolate the microbial community and to identify a single microbial strain that can degrade CL-20 with desirable efficiency. Metagenomic sequencing was performed to investigate the dynamic changes in the composition and diversity of the community. The most varied genus among the microbial community was Pseudomonas, which increased from 1.46% to 44.63% during the period of incubation (MC0–MC4). Furthermore, a new strain was isolated from the activated sludge and identified by bacterial morphological and 16S rRNA sequencing analyses. The CL-20 concentrations decreased by 75.21 μg/mL and 74.02 μg/mL in 48 h by MC4 and Pseudomonas sp. ZyL-01, respectively. Moreover, ZyL-01 could decompose 98% of the CL-20 in real effluent within 14 days of incubation with glucose as the carbon source. Finally, a draft genome sequence was obtained to predict possible degrading enzymes involved in the biodegradation of CL-20. Specifically, 330 genes that are involved in energy production and conversion were annotated by Gene Ontology functional enrichment analysis, and some of these candidates may encode enzymes that are responsible for CL-20 degradation. In summary, our studies indicate that microbes might be a valuable biological resource for the treatment of environmental contamination caused by CL-20 and that Pseudomonas sp. ZyL-01 might be a promising candidate for eradicating CL-20 to achieve a more biosafe environment and improve public health.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gongchao Jing ◽  
Yufeng Zhang ◽  
Wenzhi Cui ◽  
Lu Liu ◽  
Jian Xu ◽  
...  

Abstract Background Due to their much lower experimental and computational costs compared with metagenomic whole-genome sequencing (WGS), 16S rRNA gene amplicons have been widely used for predicting the functional profiles of microbiomes, via software tools such as PICRUSt 2. However, due to potential PCR bias and gene profile variation among phylogenetically related genomes, functional profiles predicted from 16S amplicons may deviate from WGS-derived ones, leading to misleading results. Results Here we present Meta-Apo, which greatly reduces or even eliminates such deviation, and thus yields much more consistent diversity patterns between the two approaches. Tests of Meta-Apo on > 5000 16S-rRNA amplicon human microbiome samples from 4 body sites showed that the deviation between the two strategies is significantly reduced by using only 15 WGS-amplicon training sample pairs. Moreover, Meta-Apo enables cross-platform functional comparison between WGS and amplicon samples, thus greatly improving 16S-based microbiome diagnosis; e.g., the accuracy of gingivitis diagnosis via 16S-derived functional profiles was elevated from 65 to 95% by WGS-based classification. Therefore, with the low cost of 16S-amplicon sequencing, Meta-Apo can produce a reliable, high-resolution view of microbiome function equivalent to that offered by shotgun WGS. Conclusions This suggests that large-scale, function-oriented microbiome sequencing projects can probably benefit from the lower cost of the 16S-amplicon strategy, without sacrificing the precision in functional reconstruction that otherwise requires WGS. An optimized C++ implementation of Meta-Apo is available on GitHub (https://github.com/qibebt-bioinfo/meta-apo) under a GNU GPL license. It takes the functional profiles of a few paired WGS:16S-amplicon samples as training data, and outputs the calibrated functional profiles for the much larger number of 16S-amplicon samples.


Animals ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 865
Author(s):  
Lantian Su ◽  
Xinxin Liu ◽  
Guangyao Jin ◽  
Yue Ma ◽  
Haoxin Tan ◽  
...  

In recent decades, wild sable (Carnivora Mustelidae Martes zibellina) habitats, which are often natural forests, have been squeezed by anthropogenic disturbances such as clear-cutting, tilling and grazing. Sables tend to live in sloped areas with relatively harsh conditions. Here, we determine the effects of environmental factors on wild sable gut microbial communities between high and low altitude habitats using Illumina Miseq sequencing of bacterial 16S rRNA genes. Our results showed that despite wild sable gut microbial community diversity being resilient to many environmental factors, community composition was sensitive to altitude. Wild sable gut microbial communities were dominated by Firmicutes (relative abundance 38.23%), followed by Actinobacteria (30.29%), and Proteobacteria (28.15%). Altitude was negatively correlated with the abundance of Firmicutes, suggesting that sables likely consume more plant-based food in lower habitats where plant diversity, temperature and vegetation coverage were greater. In addition, our functional gene prediction and qPCR results demonstrated that energy/fat processing microorganisms and functional genes are enriched with increasing altitude, which likely enhanced metabolic functions and helped wild sables survive in elevated habitats. Overall, our results improve the knowledge of the ecological impact of habitat change, providing insights into wild animal protection in mountain areas with harsh climate conditions.

