scholarly journals Learning Diverse Bayesian Networks

Author(s):  
Cong Chen ◽  
Changhe Yuan

Much effort has been directed at developing algorithms for learning optimal Bayesian network structures from data. When given limited or noisy data, however, the optimal Bayesian network often fails to capture the true underlying network structure. One can potentially address the problem by finding multiple most likely Bayesian networks (K-Best) in the hope that one of them recovers the true model. However, it is often the case that some of the best models come from the same peak(s) and are very similar to each other; so they tend to fail together. Moreover, many of these models are not even optimal respective to any causal ordering, thus unlikely to be useful. This paper proposes a novel method for finding a set of diverse top Bayesian networks, called modes, such that each network is guaranteed to be optimal in a local neighborhood. Such mode networks are expected to provide a much better coverage of the true model. Based on a globallocal theorem showing that a mode Bayesian network must be optimal in all local scopes, we introduce an A* search algorithm to efficiently find top M Bayesian networks which are highly probable and naturally diverse. Empirical evaluations show that our top mode models have much better diversity as well as accuracy in discovering true underlying models than those found by K-Best.

Author(s):  
Shahab Wahhab Kareem ◽  
Mehmet Cudi Okur

Bayesian networks are useful analytical models for designing the structure of knowledge in machine learning which can represent probabilistic dependency relationships among the variables. The authors present the Elephant Swarm Water Search Algorithm (ESWSA) for Bayesian network structure learning. In the algorithm; Deleting, Reversing, Inserting, and Moving are used to make the ESWSA for reaching the optimal structure solution. Mainly, water search strategy of elephants during drought periods is used in the ESWSA algorithm. The proposed method is compared with Pigeon Inspired Optimization, Simulated Annealing, Greedy Search, Hybrid Bee with Simulated Annealing, and Hybrid Bee with Greedy Search using BDeu score function as a metric for all algorithms. They investigated the confusion matrix performances of these techniques utilizing various benchmark data sets. As presented by the results of evaluations, the proposed algorithm achieves better performance than the other algorithms and produces better scores as well as the better values.


2007 ◽  
Vol 28 ◽  
pp. 1-48 ◽  
Author(s):  
B. Bidyuk ◽  
R. Dechter

The paper presents a new sampling methodology for Bayesian networks that samples only a subset of variables and applies exact inference to the rest. Cutset sampling is a network structure-exploiting application of the Rao-Blackwellisation principle to sampling in Bayesian networks. It improves convergence by exploiting memory-based inference algorithms. It can also be viewed as an anytime approximation of the exact cutset-conditioning algorithm developed by Pearl. Cutset sampling can be implemented efficiently when the sampled variables constitute a loop-cutset of the Bayesian network and, more generally, when the induced width of the network's graph conditioned on the observed sampled variables is bounded by a constant w. We demonstrate empirically the benefit of this scheme on a range of benchmarks.


Information ◽  
2019 ◽  
Vol 10 (10) ◽  
pp. 294 ◽  
Author(s):  
Xingping Sun ◽  
Chang Chen ◽  
Lu Wang ◽  
Hongwei Kang ◽  
Yong Shen ◽  
...  

Since the beginning of the 21st century, research on artificial intelligence has made great progress. Bayesian networks have gradually become one of the hotspots and important achievements in artificial intelligence research. Establishing an effective Bayesian network structure is the foundation and core of the learning and application of Bayesian networks. In Bayesian network structure learning, the traditional method of utilizing expert knowledge to construct the network structure is gradually replaced by the data learning structure method. However, as a result of the large amount of possible network structures, the search space is too large. The method of Bayesian network learning through training data usually has the problems of low precision or high complexity, which make the structure of learning differ greatly from that of reality, which has a great influence on the reasoning and practical application of Bayesian networks. In order to solve this problem, a hybrid optimization artificial bee colony algorithm is discretized and applied to structure learning. A hybrid optimization technique for the Bayesian network structure learning method is proposed. Experimental simulation results show that the proposed hybrid optimization structure learning algorithm has better structure and better convergence.


2017 ◽  
Author(s):  
Qingyang Zhang ◽  
Xuan Shi

AbstractGaussian Bayesian networks have become a widely used framework to estimate directed associations between joint Gaussian variables, where the network structure encodes decomposition of multivariate normal density into local terms. However, the resulting estimates can be inaccurate when normality assumption is moderately or severely violated, making it unsuitable to deal with recent genomic data such as the Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model which provides great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters in mixture copula functions can be efficiently estimated by a routine Expectation-Maximization algorithm. A heuristic search algorithm based on Bayesian information criterion is developed to estimate the network structure, and prediction can be further improved by the best-scoring network out of multiple predictions from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in terms of modeling flexibility and prediction accuracy, as demonstrated using a cell signaling dataset. We apply the proposed methods to the Cancer Genome Atlas data to study the genetic and epigenetic pathways that underlie serous ovarian cancer.


2017 ◽  
Vol 16 ◽  
pp. 117693511770238 ◽  
Author(s):  
Qingyang Zhang ◽  
Xuan Shi

Gaussian Bayesian networks have become a widely used framework to estimate directed associations between joint Gaussian variables, where the network structure encodes the decomposition of multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making it unsuitable for dealing with recent genomic data such as the Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model which provides great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters in mixture copula functions can be efficiently estimated by a routine expectation–maximization algorithm. A heuristic search algorithm based on Bayesian information criterion is developed to estimate the network structure, and prediction can be further improved by the best-scoring network out of multiple predictions from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in terms of modeling flexibility and prediction accuracy, as demonstrated using a cell signaling data set. We apply the proposed methods to the Cancer Genome Atlas data to study the genetic and epigenetic pathways that underlie serous ovarian cancer.


2011 ◽  
Vol 40 ◽  
pp. 729-765 ◽  
Author(s):  
W. Li ◽  
P. Poupart ◽  
P. Van Beek

Previous studies have demonstrated that encoding a Bayesian network into a SAT formula and then performing weighted model counting using a backtracking search algorithm can be an effective method for exact inference. In this paper, we present techniques for improving this approach for Bayesian networks with noisy-OR and noisy-MAX relations---two relations that are widely used in practice as they can dramatically reduce the number of probabilities one needs to specify. In particular, we present two SAT encodings for noisy-OR and two encodings for noisy-MAX that exploit the structure or semantics of the relations to improve both time and space efficiency, and we prove the correctness of the encodings. We experimentally evaluated our techniques on large-scale real and randomly generated Bayesian networks. On these benchmarks, our techniques gave speedups of up to two orders of magnitude over the best previous approaches for networks with noisy-OR/MAX relations and scaled up to larger networks. As well, our techniques extend the weighted model counting approach for exact inference to networks that were previously intractable for the approach.


Sign in / Sign up

Export Citation Format

Share Document