Abstract

Motivation
Throughout their lifespans, humans continually interact with the microbial world, including the organisms that live in and on the human body. Research in this domain has revealed extensive links between the human-associated microbiota and health. In particular, the microbiota of the human gut plays essential roles in digestion, nutrient metabolism, immune maturation and homeostasis, neurological signaling, and endocrine regulation. Microbial interaction networks are frequently estimated from data and are an indispensable tool for representing and understanding the relationships among the microbes of a microbiota. In this high-dimensional setting, the zero-inflated and compositional structure of the data (subject to a unit-sum constraint) poses challenges to the accurate estimation of microbial interaction networks.

Method
We propose the zero-inflated latent Ising (ZILI) model for microbial interaction networks, which assumes that the distribution of the relative abundances of the microbiota is determined by a finite number of latent states. This assumption is partly supported by existing findings in the literature [20]. Under the given assumptions, the ZILI model circumvents the unit-sum constraint and alleviates the zero-inflation problem. For model selection in ZILI, we propose a two-step algorithm. ZILI and the two-step algorithm are evaluated on simulated data and subsequently applied to an infant gut microbiome dataset from the New Hampshire Birth Cohort Study. The results are compared with those from the traditional Gaussian graphical model (GGM) and the dichotomous Ising model (DIS).

Results
In simulation studies, provided that the ZILI model is the true generative model for the data, the two-step algorithm estimates the graphical structure effectively and is robust to a range of alternative settings of the related factors. Neither GGM nor DIS achieves satisfactory performance in these settings.
For the infant gut microbiome dataset, we use both ZILI and GGM to estimate the microbial interaction network. The final estimated networks share a statistically significant overlap, and ZILI with the two-step algorithm tends to select a sparser network than GGM. From the shared subnetwork, we identify a hub taxon, Lachnospiraceae, whose involvement in human disease development has recently been reported in the literature.

Availability
The data and programs involved in Sections 4 and 5 are available on request from the corresponding author, [email protected]

Supplementary information
Supplementary materials are available at Bioinformatics.