Learning a mixture of microbial networks using minorization–maximization

2019 ◽  
Vol 35 (14) ◽  
pp. i23-i30 ◽  
Author(s):  
Sahar Tavakoli ◽  
Shibu Yooseph

Abstract. Motivation: The interactions among the constituent members of a microbial community play a major role in determining the overall behavior of the community and the abundance levels of its members. These interactions can be modeled using a network whose nodes represent microbial taxa and whose edges represent pairwise interactions. A microbial network is typically constructed from a sample-taxa count matrix that is obtained by sequencing multiple biological samples and identifying taxa counts. From large-scale microbiome studies, it is evident that microbial community compositions and interactions are impacted by environmental and/or host factors. Thus, it is not unreasonable to expect that a sample-taxa matrix generated as part of a large study involving multiple environmental or clinical parameters can be associated with more than one microbial network. However, to our knowledge, microbial network inference methods proposed thus far assume that the sample-taxa matrix is associated with a single network. Results: We present a mixture model framework to address the scenario when the sample-taxa matrix is associated with K microbial networks. This count matrix is modeled using a mixture of K Multivariate Poisson Log-Normal distributions, and parameters are estimated using a maximum likelihood framework. Our parameter estimation algorithm is based on the minorization–maximization principle combined with gradient ascent and block updates. Synthetic datasets were generated to assess the performance of our approach on absolute count data, compositional data and normalized data. We also addressed the recovery of sparse networks based on an l1-penalty model. Availability and implementation: MixMPLN is implemented in R and is freely available at https://github.com/sahatava/MixMPLN. Supplementary information: Supplementary data are available at Bioinformatics online.
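The generative side of the model described above can be illustrated with a minimal Python sketch. It assumes the usual form of a Multivariate Poisson Log-Normal distribution (a latent Gaussian vector whose exponential gives the Poisson rates) and draws each sample from one of K such components. The function name `sample_mix_mpln` and all parameter names are hypothetical; this is not the MixMPLN package API, and it does not implement the minorization–maximization estimation procedure.

```python
import numpy as np

def sample_mix_mpln(n_samples, weights, means, covs, seed=0):
    """Draw counts from a mixture of K multivariate Poisson log-normal components.

    weights: length-K mixing proportions summing to 1 (assumed input)
    means:   list of K latent mean vectors, each of length d
    covs:    list of K latent covariance matrices, each d x d
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    # Pick a mixture component for every sample.
    labels = rng.choice(len(weights), size=n_samples, p=weights)
    d = len(means[0])
    counts = np.empty((n_samples, d), dtype=np.int64)
    for i, k in enumerate(labels):
        # Latent log-abundances follow the Gaussian of component k.
        latent = rng.multivariate_normal(means[k], covs[k])
        # Observed taxa counts are Poisson with rate exp(latent).
        counts[i] = rng.poisson(np.exp(latent))
    return counts, labels
```

Under this sketch, the dependency structure among taxa within each component is carried by that component's latent covariance matrix, which is one natural way to tie each component to a network.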

2014 ◽  
Author(s):  
Jimin Song ◽  
Kevin C Chen

Recently, a wealth of epigenomic data has been generated by biochemical assays and next-generation sequencing (NGS) technologies. In particular, histone modification data generated by the ENCODE project and other large-scale projects show specific patterns associated with regulatory elements in the human genome. It is important to build a unified statistical model to decipher the patterns of multiple histone modifications in a cell type to annotate chromatin states such as transcription start sites, enhancers and transcribed regions rather than to map histone modifications individually to regulatory elements. Several genome-wide statistical models have been developed based on hidden Markov models (HMMs). These methods typically use the Expectation-Maximization (EM) algorithm to estimate the parameters of the model. Here we used spectral learning, a state-of-the-art parameter estimation algorithm in machine learning. We found that spectral learning plus a few (up to five) iterations of local optimization of the likelihood outperforms the standard EM algorithm. We also evaluated our software implementation called Spectacle on independent biological datasets and found that Spectacle annotated experimentally defined functional elements such as enhancers significantly better than a previous state-of-the-art method. Spectacle can be downloaded from https://github.com/jiminsong/Spectacle.


2019 ◽  
Vol 35 (14) ◽  
pp. i370-i378 ◽  
Author(s):  
Jiafan Zhu ◽  
Xinhao Liu ◽  
Huw A Ogilvie ◽  
Luay K Nakhleh

Abstract. Motivation: Reticulate evolutionary histories, such as those arising in the presence of hybridization, are best modeled as phylogenetic networks. Recently developed methods allow for statistical inference of phylogenetic networks while also accounting for other processes, such as incomplete lineage sorting. However, these methods can only handle a small number of loci from a handful of genomes. Results: In this article, we introduce a novel two-step method for scalable inference of phylogenetic networks from the sequence alignments of multiple, unlinked loci. The method infers networks on subproblems and then merges them into a network on the full set of taxa. To reduce the number of trinets to infer, we formulate a Hitting Set version of the problem of finding a small number of subsets, and implement a simple heuristic to solve it. We studied their performance, in terms of both running time and accuracy, on simulated as well as on biological datasets. The two-step method accurately infers phylogenetic networks at a scale that is infeasible with existing methods. The results are a significant and promising step towards accurate, large-scale phylogenetic network inference. Availability and implementation: We implemented the algorithms in the publicly available software package PhyloNet (https://bioinfocs.rice.edu/PhyloNet). Supplementary information: Supplementary data are available at Bioinformatics online.
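The abstract does not spell out the heuristic it uses, so the following is only a sketch of the textbook greedy approach to the general Hitting Set problem: repeatedly pick the element that appears in the most sets not yet hit. The function name `greedy_hitting_set` and the toy input are hypothetical, and this should not be read as the PhyloNet implementation.

```python
def greedy_hitting_set(sets):
    """Return elements such that every non-empty input set contains at least one.

    Greedy rule: choose the element occurring in the most un-hit sets;
    ties are broken by sorted order so the result is deterministic.
    Empty sets cannot be hit and are ignored.
    """
    remaining = [set(s) for s in sets if s]
    chosen = []
    while remaining:
        # Count how many un-hit sets each element would hit.
        counts = {}
        for s in remaining:
            for element in s:
                counts[element] = counts.get(element, 0) + 1
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        # Drop every set that the chosen element now hits.
        remaining = [s for s in remaining if best not in s]
    return chosen


# Toy usage with hypothetical labels.
example = [{"A", "B"}, {"B", "C"}, {"C", "D"}]
hitters = greedy_hitting_set(example)
```

The greedy rule gives no optimality guarantee in general, but it is simple and fast, which matches the abstract's description of a "simple heuristic" only in spirit.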


2018 ◽  
Author(s):  
Gokmen Altay

Abstract. Motivation: Inferring large-scale directional networks with higher accuracy has important applications, such as in gene regulatory networks or finance. Results: We modified a well-established conservative causal core network inference algorithm, C3NET, to be able to infer very large-scale networks with direction information. This advanced version is called Ac3net. We demonstrate that Ac3net outperforms C3NET and many other popular algorithms when considering the directional interaction information of gene/protein networks. We provide an R package and present performance results that are reproducible via the Supplementary file. Availability: Ac3net is available on CRAN and at github.com/altayg/Ac3net. Contact: [email protected] Supplementary information: Supplementary file is available online.


Agriculture ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 561
Author(s):  
Enze Wang ◽  
Xiaolong Lin ◽  
Lei Tian ◽  
Xinguang Wang ◽  
Li Ji ◽  
...  

Rice straw is a byproduct of agricultural production and an important agricultural resource. However, rice straw has not yet been effectively used, and incorrect treatment methods (such as burning in the field) can cause serious damage to the environment. Studies have shown that straw returning is beneficial to soil, but few studies have focused on how the amount of straw returned in the short term affects the soil microbial community. This study evaluates whether returning different amounts of rice straw (0%, 50%, 75%, and 100%) to the field in the short term affects the diversity and composition of the soil microbial community and the correlation between bacteria and fungi. The results show that the amount of straw returned to the field is the main factor that triggers changes in the abundance and composition of the microbial community in the paddy soil. A small amount of added straw (≤ 50% straw added) mainly affects the composition of the bacterial community, while a larger amount of added straw (> 50% straw added) mainly affects the composition of the fungal community. Returning a large amount of straw increases the abundance of microbes related to carbon and iron cycles in the paddy soil, thus promoting the carbon and iron cycle processes to a certain extent. In addition, network analysis shows that returning a large amount of straw also increases the complexity of the microbial network, which may encourage more microbes to share niches and comprehensively improve the ecological environment of paddy soil. This study may provide some useful guidance for rice straw returning in northeast China.

