Learning Bayesian Networks

Author(s):  
Marco F. Ramoni ◽  
Paola Sebastiani

Born at the intersection of artificial intelligence, statistics, and probability, Bayesian networks (Pearl, 1988) are a representation formalism at the cutting edge of knowledge discovery and data mining (Heckerman, 1997). Bayesian networks belong to a more general class of models called probabilistic graphical models (Whittaker, 1990; Lauritzen, 1996), which arise from the combination of graph theory and probability theory. Their success rests on their ability to handle complex probabilistic models by decomposing them into smaller, more manageable components. A probabilistic graphical model is defined by a graph in which nodes represent stochastic variables and arcs represent dependencies among those variables. The arcs are annotated with probability distributions that shape the interaction between the linked variables. A probabilistic graphical model is called a Bayesian network when the graph connecting its variables is a directed acyclic graph (DAG). This graph represents conditional independence assumptions that are used to factorize the joint probability distribution of the network variables, which makes learning from a large database computationally tractable. A Bayesian network induced from data can be used to investigate distant relationships between variables, as well as to make predictions and produce explanations, by computing the conditional probability distribution of one variable given the values of some others.
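
The factorization mentioned in the abstract can be written out explicitly. The notation here is an assumption, since the abstract defines none: X_1, ..., X_n stand for the network variables and pa(X_i) for the parents of X_i in the DAG. Under that notation, the joint distribution takes the standard form:

```latex
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
```

Each factor involves only one variable and its parents, so the parameters of each factor can be estimated from the database separately.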


Author(s):  
Yang Xiang

Graphical models such as Bayesian networks (BNs) (Pearl, 1988) and decomposable Markov networks (DMNs) (Xiang, Wong & Cercone, 1997) have been applied widely to probabilistic reasoning in intelligent systems. Figure 1 illustrates a BN and a DMN on a trivial uncertain domain: a virus can damage computer files, and so can a power glitch. A power glitch also causes a VCR to reset. The BN in (a) has four nodes, corresponding to four binary variables v (virus), p (power glitch), f (file damage), and r (VCR reset), each taking values from {true, false}. The graph structure encodes a set of dependence and independence assumptions (e.g., that f is directly dependent on v and p but is independent of r once the value of p is known). Each node is associated with a probability distribution conditioned on its parent nodes (e.g., P(f | v, p)). The joint probability distribution is the product P(v, p, f, r) = P(f | v, p) P(r | p) P(v) P(p). The DMN in (b) has two groups of nodes that are maximally pairwise connected, called cliques. Each clique is associated with a probability distribution (e.g., clique {v, p, f} is assigned P(v, p, f)). The joint probability distribution is P(v, p, f, r) = P(v, p, f) P(r, p) / P(p), where P(p) can be derived from either clique distribution. The networks can be used, for instance, to reason about whether there are viruses in the computer system after observations on f and r are made.
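
The two factorizations in this abstract can be checked numerically. The following is a minimal sketch in Python. The abstract gives no probability values, so every number in the tables below is a hypothetical value chosen for illustration, and the function names are likewise invented here. It builds the BN joint distribution, derives the DMN clique distributions from it, and answers the virus query by brute-force enumeration.

```python
from itertools import product

# Hypothetical CPT values, chosen only for illustration (the source gives none).
P_V = {True: 0.1, False: 0.9}            # P(v)
P_P = {True: 0.2, False: 0.8}            # P(p)
P_R_GIVEN_P = {True: 0.9, False: 0.05}   # P(r = true | p)
P_F_GIVEN_VP = {                         # P(f = true | v, p)
    (True, True): 0.95, (True, False): 0.8,
    (False, True): 0.6, (False, False): 0.01,
}
BOOL = (True, False)


def bernoulli(p_true, value):
    """Probability of `value` for a binary variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(v, p, f, r):
    """BN factorization: P(v, p, f, r) = P(f | v, p) P(r | p) P(v) P(p)."""
    return (bernoulli(P_F_GIVEN_VP[(v, p)], f)
            * bernoulli(P_R_GIVEN_P[p], r)
            * P_V[v] * P_P[p])


def dmn_joint(v, p, f, r):
    """DMN factorization: P(v, p, f) P(r, p) / P(p), cliques taken from the BN joint."""
    p_vpf = sum(joint(v, p, f, rr) for rr in BOOL)
    p_rp = sum(joint(vv, p, ff, r) for vv, ff in product(BOOL, repeat=2))
    p_p = sum(joint(vv, p, ff, rr) for vv, ff, rr in product(BOOL, repeat=3))
    return p_vpf * p_rp / p_p


def posterior_virus(f, r):
    """P(v = true | f, r), summing the joint over the unobserved variable p."""
    numerator = sum(joint(True, p, f, r) for p in BOOL)
    evidence = sum(joint(v, p, f, r) for v, p in product(BOOL, repeat=2))
    return numerator / evidence


total = sum(joint(*state) for state in product(BOOL, repeat=4))
```

Enumeration is used here only because the domain has four binary variables; it grows exponentially with the number of variables and is not how inference is done in larger networks.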


2014 ◽  
Vol 926-930 ◽  
pp. 3594-3597
Author(s):  
Cai Chang Ding ◽  
Wen Xiu Peng ◽  
Wei Ming Wang

Estimation of Distribution Algorithms (EDAs) are a set of algorithms that belong to the field of Evolutionary Computation. In EDAs there are neither crossover nor mutation operators. Instead, the new population of individuals is sampled from a probability distribution, which is estimated from a database that contains the selected individuals from the previous generation. Thus, the interrelations between the different variables that represent the individuals may be explicitly expressed through the joint probability distribution associated with the individuals selected at each generation.
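
The sample-select-estimate loop described here can be sketched in a few lines of Python. This sketch uses the simplest possible distribution model, independent per-bit marginals (often called UMDA), which does not capture interrelations between variables the way a full joint distribution would. The OneMax fitness function (count of 1 bits), the parameter values, and the clamping bounds are all assumptions made for illustration, not taken from the paper.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA on OneMax: no crossover, no mutation, only sampling."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits          # initial model: each bit is 1 with prob. 0.5
    best = None
    for _ in range(generations):
        # Sample a new population from the current probability model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Select the fittest individuals (fitness = number of 1 bits).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        # Re-estimate the model from the selected individuals.
        probs = [sum(ind[i] for ind in selected) / n_select for i in range(n_bits)]
        # Clamp so that no bit value becomes impossible to sample.
        probs = [min(max(p, 0.05), 0.95) for p in probs]
    return best, probs


best, probs = umda_onemax()
```

Replacing the per-bit marginals with a model that represents dependencies, such as a Bayesian network learned from the selected individuals, gives the more expressive variants that the abstract alludes to.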


Author(s):  
Dimitris Margaritis ◽  
Christos Faloutsos ◽  
Sebastian Thrun

We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that can be formed by the user. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can succinctly represent the underlying joint probability distribution of the domain, and can therefore be used to calculate approximate counts for any conjunctive query combination of attribute values and “don’t cares.” The structure and parameters of these networks are learned from the database in a preprocessing stage. By means of such a network, the proposed method, called NetCube, exploits correlations and independencies among attributes to answer a count query quickly without accessing the database. Our preprocessing algorithm scales linearly on the size of the database, and is thus scalable; it is also parallelizable with a straightforward parallel implementation. We give an algorithm for estimating the count result of arbitrary queries that is fast (constant) on the database size. Our experimental results show that NetCubes have fast generation and use, achieve excellent compression and have low reconstruction error. Moreover, they naturally allow for visualization and data mining, at no extra cost.
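
The core idea, estimating a count as the database size times the probability the network assigns to the query, can be illustrated with a toy example. This is a sketch under stated assumptions and not the NetCube algorithm itself. The three binary attributes, the fixed structure A → B and A → C, the eight records, and all function names are hypothetical, and the structure is assumed rather than learned. Attributes missing from a query are treated as "don't cares" and summed out.

```python
from itertools import product

# Hypothetical toy database with three binary attributes.
RECORDS = [
    {"A": 1, "B": 1, "C": 1}, {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 0, "C": 1}, {"A": 1, "B": 1, "C": 1},
    {"A": 0, "B": 0, "C": 0}, {"A": 0, "B": 0, "C": 1},
    {"A": 0, "B": 1, "C": 0}, {"A": 0, "B": 0, "C": 0},
]


def learn(records):
    """One pass over the data: estimate P(A), P(B | A), P(C | A) by counting."""
    n = len(records)
    p_a = {a: sum(r["A"] == a for r in records) / n for a in (0, 1)}
    cpt = {}
    for child in ("B", "C"):
        for a in (0, 1):
            rows = [r for r in records if r["A"] == a]
            for x in (0, 1):
                # Fall back to 0.5 if no record has this parent value.
                cpt[(child, x, a)] = (
                    sum(r[child] == x for r in rows) / len(rows) if rows else 0.5
                )
    return n, p_a, cpt


def joint(model, a, b, c):
    """P(a, b, c) = P(a) P(b | a) P(c | a) under the assumed structure."""
    _, p_a, cpt = model
    return p_a[a] * cpt[("B", b, a)] * cpt[("C", c, a)]


def estimate_count(model, query):
    """Approximate count: N times the probability mass matching the query."""
    n = model[0]
    mass = 0.0
    for a, b, c in product((0, 1), repeat=3):
        row = {"A": a, "B": b, "C": c}
        if all(row[k] == v for k, v in query.items()):
            mass += joint(model, a, b, c)
    return n * mass


def exact_count(records, query):
    """Ground truth obtained by scanning the database."""
    return sum(all(r[k] == v for k, v in query.items()) for r in records)


model = learn(RECORDS)
```

Once the model is learned, `estimate_count` never touches `RECORDS`, so its cost does not depend on the database size. Queries that align with the assumed structure, such as `{"A": 1, "B": 1}`, are reproduced exactly, while a query such as `{"B": 1, "C": 1}` is only approximated.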


2009 ◽  
pp. 2011-2036
Author(s):  
Dimitris Margaritis ◽  
Christos Faloutsos ◽  
Sebastian Thrun

We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that can be formed by the user. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can succinctly represent the underlying joint probability distribution of the domain, and can therefore be used to calculate approximate counts for any conjunctive query combination of attribute values and “don’t cares.” The structure and parameters of these networks are learned from the database in a preprocessing stage. By means of such a network, the proposed method, called NetCube, exploits correlations and independencies among attributes to answer a count query quickly without accessing the database. Our preprocessing algorithm scales linearly on the size of the database, and is thus scalable; it is also parallelizable with a straightforward parallel implementation. We give an algorithm for estimating the count result of arbitrary queries that is fast (constant) on the database size. Our experimental results show that NetCubes have fast generation and use, achieve excellent compression and have low reconstruction error. Moreover, they naturally allow for visualization and data mining, at no extra cost.


Information ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 211
Author(s):  
David Kinney

This article considers the extent to which Bayesian networks with imprecise probabilities, which are used in statistics and computer science for predictive purposes, can be used to represent causal structure. It is argued that the adequacy conditions for causal representation in the precise context—the Causal Markov Condition and Minimality—do not readily translate into the imprecise context. Crucial to this argument is the fact that the independence relation between random variables can be understood in several different ways when the joint probability distribution over those variables is imprecise, none of which provides a compelling basis for the causal interpretation of imprecise Bayes nets. I conclude that there are serious limits to the use of imprecise Bayesian networks to represent causal structure.


2018 ◽  
Vol 2 (1) ◽  
pp. 62
Author(s):  
Hasniati Hasniati ◽  
Arianti Arianti ◽  
William Philip

Bayesian networks can be used to calculate the probability that various disease symptoms are present. In this paper, the authors apply a Bayesian network model to calculate the probability of diseases that cause shortness of breath in infants. The network is built on data obtained by interviewing pediatricians, namely the names of the diseases, their causes, and the symptoms of shortness of breath in infants. The structure of the Bayesian network for infant shortness of breath is constructed according to whether or not a relationship exists between each symptom and a shortness-of-breath disease. Each symptom represented in the network structure has estimated parameters obtained from existing data or from the specialists' knowledge. These estimates are called the prior probability, or degree of belief, of the symptoms of infant shortness of breath. Once the prior probabilities are known, the next step is to determine the conditional probability between each type of shortness-of-breath disease and each of its symptoms. In the final step, the posterior probability is computed from the resulting joint probability distribution (JPD), and this value is then used to calculate the probability that a symptom occurs. Taking as an example a case in which an infant shows shortness of breath, weakness, restlessness, and fever, it is concluded that the infant has the respiratory disease neonatal pneumonia with probability 0.1688812743.


2018 ◽  
Vol 63 ◽  
pp. 421-460
Author(s):  
Kathryn Blackmond Laskey ◽  
Wei Sun ◽  
Robin Hanson ◽  
Charles Twardy ◽  
Shou Matsumoto ◽  
...  

We describe algorithms for use by prediction markets in forming a crowd consensus joint probability distribution over thousands of related events. Equivalently, we describe market mechanisms to efficiently crowdsource both structure and parameters of a Bayesian network. Prediction markets are among the most accurate methods to combine forecasts; forecasters form a consensus probability distribution by trading contingent securities. A combinatorial prediction market forms a consensus joint distribution over many related events by allowing conditional trades or trades on Boolean combinations of events. Explicitly representing the joint distribution is infeasible, but standard inference algorithms for graphical probability models render it tractable for large numbers of base events. We show how to adapt these algorithms to compute expected assets conditional on a prospective trade, and to find the conditional state where a trader has minimum assets, allowing full asset reuse. We compare the performance of three algorithms: the straightforward algorithm from the DAGGRE (Decomposition-Based Aggregation) prediction market for geopolitical events, the simple block-merge model from the SciCast market for science and technology forecasting, and a more sophisticated algorithm we developed for future markets.

