Improved High Dimensional Discrete Bayesian Network Inference using Triplet Region Construction

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.12198 ◽

2020 ◽

Vol 69 ◽

pp. 231-295

Author(s):

Peng Lin ◽

Martin Neil ◽

Norman Fenton

Keyword(s):

Network Inference ◽

General Purpose ◽

Space Complexity ◽

High Dimensional ◽

Choice Problem ◽

Worst Case ◽

Exact Inference ◽

Tree Width ◽

Inference Methods ◽

Bayesian Network Inference

Performing efficient inference on high dimensional discrete Bayesian Networks (BNs) is challenging. When using exact inference methods the space complexity can grow exponentially with the tree-width, thus making computation intractable. This paper presents a general purpose approximate inference algorithm, based on a new region belief approximation method, called Triplet Region Construction (TRC). TRC reduces the cluster space complexity for factorized models from worst-case exponential to polynomial by performing graph factorization and producing clusters of limited size. Unlike previous generations of region-based algorithms, TRC is guaranteed to converge and effectively addresses the region choice problem that bedevils other region-based algorithms used for BN inference. Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms.
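The space-complexity claim in the abstract above can be made concrete with a little arithmetic. The sketch below is not the TRC algorithm; it only counts table entries, assuming every variable has the same number of states. The function names and the figure of 100 clusters are hypothetical and chosen for illustration.

```python
def table_entries(num_states: int, num_variables: int) -> int:
    """Entries in a joint probability table over discrete variables
    that all share the same number of states."""
    return num_states ** num_variables


def total_entries(num_clusters: int, num_states: int, cluster_size: int) -> int:
    """Total storage for a collection of equally sized clusters."""
    return num_clusters * table_entries(num_states, cluster_size)


if __name__ == "__main__":
    # A single clique of 31 binary variables (tree-width 30).
    print(table_entries(2, 31))
    # A hypothetical decomposition into 100 clusters of three binary variables.
    print(total_entries(100, 2, 3))
```

The clique table grows exponentially with the number of variables it contains, whereas the total for size-limited clusters grows only linearly with the number of clusters.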


Learning causal biological networks with the principle of Mendelian randomization

10.1101/171348 ◽

2017 ◽

Cited By ~ 1

Author(s):

Md. Bahadur Badsha ◽

Audrey Qiuyan Fu

Keyword(s):

Biological Networks ◽

Network Inference ◽

Mendelian Randomization ◽

Learning Algorithm ◽

General Purpose ◽

Molecular Phenotype ◽

Phenotype Data ◽

Causal Graphs ◽

Molecular Phenotypes ◽

Inference Methods

Abstract: Although large amounts of genomic data are available, it remains a challenge to reliably infer causal (i.e., regulatory) relationships among molecular phenotypes (such as gene expression), especially when many phenotypes are involved. We extend the interpretation of the principle of Mendelian randomization (PMR) and present MRPC, a novel machine learning algorithm that incorporates the PMR into classical algorithms for learning causal graphs in computer science. MRPC learns a causal biological network efficiently and robustly by integrating genotype and molecular phenotype data; directed edges in the network indicate causal directions. We demonstrate through simulation that MRPC outperforms existing general-purpose network inference methods and other PMR-based methods. We apply MRPC to distinguish direct and indirect targets among multiple genes associated with expression quantitative trait loci.
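The reasoning behind the PMR can be illustrated with a toy simulation. This is not the MRPC algorithm; it is a minimal sketch that assumes a linear chain V -> T1 -> T2, where V is a genotype coded 0, 1 or 2 and T1 and T2 are phenotypes. All function names, the effect sizes and the use of partial correlation as the conditional independence measure are assumptions made for illustration. If the chain holds, V and T2 are correlated marginally but become approximately uncorrelated once T1 is accounted for.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly adjusting both for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_chain(n, seed=0):
    """Simulate V -> T1 -> T2 with unit effects and standard normal noise."""
    rng = random.Random(seed)
    v = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n)]
    t1 = [g + rng.gauss(0, 1) for g in v]
    t2 = [a + rng.gauss(0, 1) for a in t1]
    return v, t1, t2


v, t1, t2 = simulate_chain(5000)
marginal = pearson(v, t2)              # clearly non-zero: V influences T2 via T1
conditional = partial_corr(v, t2, t1)  # close to zero: T1 blocks the path
```

Because the genotype is fixed before the phenotypes are measured, the asymmetry between the marginal and conditional associations is what lets the edge between T1 and T2 be oriented.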


High-Dimensional Bayesian Network Inference From Systems Genetics Data Using Genetic Node Ordering

Frontiers in Genetics ◽

10.3389/fgene.2019.01196 ◽

2019 ◽

Vol 10 ◽

Author(s):

Lingfei Wang ◽

Pieter Audenaert ◽

Tom Michoel

Keyword(s):

Bayesian Network ◽

Network Inference ◽

High Dimensional ◽

Systems Genetics ◽

Bayesian Network Inference


Inference and reasoning in a Bayesian knowledge-intensive CBR system

Progress in Artificial Intelligence ◽

10.1007/s13748-020-00223-1 ◽

2021 ◽

Author(s):

Hoda Nikpour ◽

Agnar Aamodt

Keyword(s):

Network Inference ◽

Semantic Network ◽

Case Based Reasoning ◽

Oil Well ◽

Well Drilling ◽

Knowledge Intensive ◽

Inference Methods ◽

Bayesian Network Inference ◽

Weighted Error ◽

Statistical Metrics

Abstract: This paper presents the inference and reasoning methods in a Bayesian supported knowledge-intensive case-based reasoning (CBR) system called BNCreek. The inference and reasoning process in this system is a combination of three methods. The semantic network inference methods and the CBR method are employed to handle the difficulties of inference and reasoning in uncertain domains. The Bayesian network inference methods are employed to make the process more accurate. An experiment is conducted on oil well drilling, a complex and uncertain application domain. The system is evaluated against expert estimations and compared with seven other corresponding systems. The normalized discounted cumulative gain (NDCG) as a rank-based metric, and the weighted error (WE) and root-square error (RSE) as statistical metrics, are employed to evaluate different aspects of the system's capabilities. The results show the efficiency of the developed inference and reasoning methods.
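For readers unfamiliar with the rank-based metric named above, the following is a minimal sketch of NDCG. The abstract does not say which variant the authors used; this version assumes linear gain with a base-2 logarithmic discount, and the function names are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A ranking that already lists the most relevant cases first scores 1.0, and any other ordering of the same items scores lower.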


High-dimensional Bayesian network inference from systems genetics data using genetic node ordering

10.1101/501460 ◽

2018 ◽

Author(s):

Lingfei Wang ◽

Pieter Audenaert ◽

Tom Michoel

Keyword(s):

Genetic Variation ◽

Bayesian Network ◽

Gene Regulatory Networks ◽

Gene Networks ◽

Regulatory Networks ◽

Network Inference ◽

High Dimensional ◽

Systems Genetics ◽

Gene Regulatory ◽

Bayesian Network Inference

Abstract: Studying the impact of genetic variation on gene regulatory networks is essential to understand the biological mechanisms by which genetic variation causes variation in phenotypes. Bayesian networks provide an elegant statistical approach for multi-trait genetic mapping and modelling causal trait relationships. However, inferring Bayesian gene networks from high-dimensional genetics and genomics data is challenging, because the number of possible networks scales super-exponentially with the number of nodes, and the computational cost of conventional Bayesian network inference methods quickly becomes prohibitive. We propose an alternative method to infer high-quality Bayesian gene networks that easily scales to thousands of genes. Our method first reconstructs a node ordering by conducting pairwise causal inference tests between genes, which then allows a Bayesian network to be inferred via a series of independent variable selection problems, one for each gene. We demonstrate using simulated and real systems genetics data that this results in a Bayesian network with equal, and sometimes better, likelihood than the conventional methods, while having a significantly higher overlap with ground-truth networks and being orders of magnitude faster. Moreover, our method allows for a unified false discovery rate control across genes and individual edges, and thus a rigorous and easily interpretable way for tuning the sparsity level of the inferred network. Bayesian network inference using pairwise node ordering is a highly efficient approach for reconstructing gene regulatory networks when prior information for the inclusion of edges exists or can be inferred from the available data.
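The two-stage idea described above (fix a node ordering, then choose parents for each gene only among its predecessors) can be sketched as follows. This is not the authors' implementation: the greedy ordering rule, the fixed threshold standing in for variable selection, and the toy score table are all assumptions made for illustration. Here `score[a][b]` is taken to be the pairwise evidence that gene `a` regulates gene `b`.

```python
def order_nodes(score):
    """Greedy ordering: repeatedly pick the remaining gene with the strongest
    total outgoing evidence towards the other remaining genes."""
    remaining = sorted(score)
    order = []
    while remaining:
        best = max(
            remaining,
            key=lambda g: sum(score[g].get(h, 0.0) for h in remaining if h != g),
        )
        order.append(best)
        remaining.remove(best)
    return order


def select_parents(order, score, threshold):
    """For each gene, keep predecessors whose evidence passes the threshold.
    Each gene is handled independently of the others."""
    parents = {}
    for position, gene in enumerate(order):
        candidates = order[:position]
        parents[gene] = [p for p in candidates if score[p].get(gene, 0.0) >= threshold]
    return parents


# Toy pairwise evidence for three genes (hypothetical values).
score = {
    "A": {"B": 0.9, "C": 0.6},
    "B": {"A": 0.1, "C": 0.8},
    "C": {"A": 0.05, "B": 0.1},
}
order = order_nodes(score)
parents = select_parents(order, score, threshold=0.5)
```

Because every parent precedes its child in the ordering, the resulting graph is acyclic by construction, which is what removes the need to search over the super-exponential space of network structures.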


ECBN: Ensemble Clustering based on Bayesian Network inference for Single-cell RNA-seq Data

2020 39th Chinese Control Conference (CCC) ◽

10.23919/ccc50068.2020.9188589 ◽

2020 ◽

Author(s):

Dexin Zhang ◽

Yuan Zhu

Keyword(s):

Bayesian Network ◽

Single Cell ◽

Network Inference ◽

Ensemble Clustering ◽

Rna Seq ◽

Bayesian Network Inference


ModularBoost: an efficient network inference algorithm based on module decomposition

BMC Bioinformatics ◽

10.1186/s12859-021-04074-y ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Xinyu Li ◽

Wei Zhang ◽

Jianming Zhang ◽

Guang Li

Keyword(s):

Network Inference ◽

Detection Methods ◽

Inference Problem ◽

Topological Constraints ◽

Inference Algorithms ◽

Module Detection ◽

Series Expression ◽

Gene Modules ◽

Inference Methods ◽

Complicated Task

Background: Given expression data, gene regulatory network (GRN) inference approaches try to determine regulatory relations. However, current inference methods ignore the inherent topological characteristics of GRNs to some extent, leading to structures that lack clear biological explanation. To increase the biophysical meaning of inferred networks, this study performed data-driven module detection before network inference. Gene modules were identified by decomposition-based methods. Results: ICA-decomposition-based module detection methods have been used to detect functional modules directly from transcriptomic data. Experiments on time-series expression, curated and scRNA-seq datasets demonstrated the advantages of the proposed ModularBoost method over established methods, especially in efficiency and accuracy. For scRNA-seq datasets, the ModularBoost method outperformed other candidate inference algorithms. Conclusions: As a complicated task, GRN inference can be decomposed into several tasks of reduced complexity. Using identified gene modules as topological constraints, the initial inference problem can be solved by inferring intra-modular and inter-modular interactions separately. Experimental outcomes suggest that the proposed ModularBoost method can improve the accuracy and efficiency of inference algorithms by introducing topological constraints.
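The decomposition described in the conclusions can be pictured as a partition of the candidate edges. The sketch below is not ModularBoost itself; it assumes that module membership is already known and that inter-modular interactions are considered at the level of gene pairs, which the abstract does not specify. The function name and the toy module assignment are hypothetical.

```python
def split_candidate_edges(genes, module_of):
    """Partition ordered gene pairs into intra- and inter-module candidates."""
    intra, inter = [], []
    for regulator in genes:
        for target in genes:
            if regulator == target:
                continue
            pair = (regulator, target)
            if module_of[regulator] == module_of[target]:
                intra.append(pair)
            else:
                inter.append(pair)
    return intra, inter


# Four genes assigned to two modules (hypothetical assignment).
module_of = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
intra, inter = split_candidate_edges(sorted(module_of), module_of)
```

Each subset can then be handed to an inference routine separately, so that no single subproblem has to consider all gene pairs at once.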


Bayesian network inference using marginal trees

International Journal of Approximate Reasoning ◽

10.1016/j.ijar.2015.07.006 ◽

2016 ◽

Vol 68 ◽

pp. 127-152 ◽

Cited By ~ 3

Author(s):

Cory J. Butz ◽

Jhonatan S. Oliveira ◽

Anders L. Madsen

Keyword(s):

Bayesian Network ◽

Network Inference ◽

Bayesian Network Inference


Evaluating the reproducibility of single-cell gene regulatory network inference algorithms

10.1101/2020.11.10.375923 ◽

2020 ◽

Author(s):

Yoonjee Kang ◽

Denis Thieffry ◽

Laura Cantini

Keyword(s):

Single Cell ◽

Network Inference ◽

Simulated Data ◽

Ground Truth ◽

Real Data ◽

Gene Regulatory Network Inference ◽

Sequencing Platform ◽

Cell Network ◽

Inference Algorithms ◽

Inference Methods

Abstract: Networks are powerful tools to represent and investigate biological systems. The development of algorithms inferring regulatory interactions from functional genomics data has been an active area of research. With the advent of single-cell RNA-seq data (scRNA-seq), numerous methods specifically designed to take advantage of single-cell datasets have been proposed. However, published benchmarks on single-cell network inference are mostly based on simulated data. Once applied to real data, these benchmarks take into account only a small set of genes and only compare the inferred networks with an imposed ground truth. Here, we benchmark four single-cell network inference methods based on their reproducibility, i.e. their ability to infer similar networks when applied to two independent datasets for the same biological condition. We tested each of these methods on real data from three biological conditions: human retina, T-cells in colorectal cancer, and human hematopoiesis. GENIE3 proved to be the most reproducible algorithm, independently of the single-cell sequencing platform, the cell type annotation system, the number of cells constituting the dataset, or the thresholding applied to the links of the inferred networks. To ensure reproducibility and ease extension of this benchmark study, we implemented all the analyses in scNET, a Jupyter notebook available at https://github.com/ComputationalSystemsBiology/scNET.
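One simple way to quantify reproducibility in the sense described above is to compare the strongest links of two networks inferred from independent datasets. The abstract does not name the similarity measure used in the study; the sketch below assumes a Jaccard index over the top-k edges, and the edge weights are made-up values.

```python
def top_edges(weights, k):
    """Return the set of the k highest-weighted edges."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Edge weights from two independent datasets of the same condition (hypothetical).
net_a = {("TF1", "G1"): 0.9, ("TF1", "G2"): 0.7, ("TF2", "G1"): 0.2, ("TF2", "G3"): 0.6}
net_b = {("TF1", "G1"): 0.8, ("TF1", "G2"): 0.3, ("TF2", "G1"): 0.1, ("TF2", "G3"): 0.75}
reproducibility = jaccard(top_edges(net_a, 2), top_edges(net_b, 2))
```

Varying `k` corresponds to varying the threshold applied to the links, one of the factors the benchmark reports on.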


BRANE Cut: Biologically-Related A priori Network Enhancement with Graph cuts for Gene Regulatory Network Inference

10.1101/032383 ◽

2015 ◽

Author(s):

Aurélie Pirayre ◽

Camille Couprie ◽

Frédérique Bidard ◽

Laurent Duval ◽

Jean-Christophe Pesquet

Keyword(s):

Gene Regulatory Network ◽

Regulatory Network ◽

Gene Networks ◽

Network Inference ◽

State Of The Art ◽

A Priori ◽

Graph Cuts ◽

Gene Regulatory Network Inference ◽

Gene Regulatory ◽

Inference Methods

Background: Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulatory relationships in organism cells. Despite the large number of available Gene Regulatory Network inference methods, the problem remains challenging: the underdetermination in the space of possible solutions requires additional constraints that incorporate a priori information on gene interactions. Methods: Weighting all possible pairwise gene relationships by a probability of edge presence, we formulate the regulatory network inference as a discrete variational problem on graphs. We enforce biologically plausible coupling between groups and types of genes by minimizing an edge labeling functional coding for a priori structures. The optimization is carried out with Graph cuts, an approach popular in image processing and computer vision. We compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge. Results: Our BRANE Cut approach infers the five DREAM4 in silico networks more accurately (with improvements from 6% to 11%). On a real Escherichia coli compendium, an improvement of 11.8% compared to CLR and 3% compared to GENIE3 is obtained in terms of Area Under Precision-Recall curve. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, our method achieves a performance similar to that of GENIE3, while being more than seven times faster. The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html Conclusions: BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the-art GRN inference methods. It is applicable as a generic network inference post-processing step, due to its computational efficiency.


SimiC: A Single Cell Gene Regulatory Network Inference method with Similarity Constraints

10.1101/2020.04.03.023002 ◽

2020 ◽

Author(s):

Jianhao Peng ◽

Ullas V. Chembazhi ◽

Sushant Bangru ◽

Ian M. Traniello ◽

Auinash Kalsotra ◽

...

Keyword(s):

Single Cell ◽

Network Inference ◽

Regional Analysis ◽

Supplementary Information ◽

Inference Method ◽

Gene Regulatory Network Inference ◽

Inference Problem ◽

Cell State ◽

Gene Regulatory ◽

Inference Methods

Motivation: With the use of single-cell RNA sequencing (scRNA-Seq) technologies, it is now possible to acquire gene expression data for each individual cell in samples containing up to millions of cells. These cells can be further grouped into different states along an inferred cell differentiation path, which are potentially characterized by similar, but distinct enough, gene regulatory networks (GRNs). Hence, it would be desirable for scRNA-Seq GRN inference methods to capture the GRN dynamics across cell states. However, current GRN inference methods produce a unique GRN per input dataset (or independent GRNs per cell state), failing to capture these regulatory dynamics. Results: We propose a novel single-cell GRN inference method, named SimiC, that jointly infers the GRNs corresponding to each state. SimiC models the GRN inference problem as a LASSO optimization problem with an added similarity constraint on the GRNs associated with contiguous cell states, which captures the inter-cell-state homogeneity. We show on mouse hepatocyte single-cell data generated after partial hepatectomy that, contrary to previous GRN methods for scRNA-Seq data, SimiC is able to capture the transcription factor (TF) dynamics across liver regeneration, as well as the cell-level behavior for the regulatory program of each TF across cell states. In addition, on a honey bee scRNA-Seq experiment, SimiC is able to capture the increased heterogeneity of cells in whole-brain tissue with respect to a regional analysis tissue, and the TFs associated specifically with each sequenced tissue. Availability: SimiC is written in Python and includes an R API. It can be downloaded from https://github.com/jianhao2016/[email protected], [email protected] Supplementary information: Supplementary data are available at the code repository.
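A LASSO problem with a similarity constraint across contiguous states could take a form like the one below. This is a sketch only: the abstract does not give the objective, so the squared-error loss, the squared-difference similarity penalty and all symbols are assumptions. Here $X_k$ and $Y_k$ denote the TF and target-gene expression matrices for cell state $k$, $W_k$ the GRN weight matrix for that state, and $\lambda_1, \lambda_2 \ge 0$ the regularization weights.

```latex
\min_{W_1,\dots,W_K}\;
\sum_{k=1}^{K} \lVert Y_k - X_k W_k \rVert_F^2
\;+\; \lambda_1 \sum_{k=1}^{K} \lVert W_k \rVert_1
\;+\; \lambda_2 \sum_{k=1}^{K-1} \lVert W_k - W_{k+1} \rVert_F^2
```

Setting $\lambda_2 = 0$ recovers independent LASSO fits per cell state, while a large $\lambda_2$ pushes all states towards a single shared network; intermediate values allow the networks to change smoothly from one state to the next.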
