Learning probabilistic networks

1999 ◽  
Vol 13 (4) ◽  
pp. 321-351 ◽  
Author(s):  
PAUL J. KRAUSE

A probabilistic network is a graphical model that encodes probabilistic relationships between variables of interest. Such a model records qualitative influences between variables in addition to the numerical parameters of the probability distribution. As such, it provides an ideal form for combining data with prior knowledge, which might be limited solely to experience of the influences between some of the variables of interest. In this paper, we first show how data can be used to revise initial estimates of the parameters of a model. We then progress to showing how the structure of the model can be revised as data are obtained. Techniques for learning with incomplete data are also covered. To make the paper as self-contained as possible, we start with an introduction to probability theory and probabilistic graphical models. The paper concludes with a short discussion of how these techniques can be applied to the problem of learning causal relationships between variables in a domain of interest.
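
As a concrete illustration of the parameter-revision step described in the abstract, the sketch below shows conjugate Bayesian updating of a single binary variable; the Beta prior, the variable, and the data are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of Bayesian parameter revision for one binary variable.
# A Beta(a, b) prior over P(X = 1) is revised by counting observations:
# the posterior is Beta(a + n1, b + n0), whose mean is the revised estimate.
# The prior hyperparameters and data below are illustrative assumptions.

def revise_parameter(a, b, data):
    """Return the posterior mean estimate of P(X = 1) after seeing data."""
    n1 = sum(data)           # observed successes (X = 1)
    n0 = len(data) - n1      # observed failures  (X = 0)
    return (a + n1) / (a + b + n1 + n0)

# Prior experience encoded as Beta(2, 2); ten new observations revise it.
print(revise_parameter(2, 2, [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]))  # ~0.71
```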

Author(s):  
Marco F. Ramoni ◽  
Paola Sebastiani

Born at the intersection of artificial intelligence, statistics, and probability, Bayesian networks (Pearl, 1988) are a representation formalism at the cutting edge of knowledge discovery and data mining (Heckerman, 1997). Bayesian networks belong to a more general class of models called probabilistic graphical models (Whittaker, 1990; Lauritzen, 1996) that arise from the combination of graph theory and probability theory, and their success rests on their ability to handle complex probabilistic models by decomposing them into smaller, amenable components. A probabilistic graphical model is defined by a graph, where nodes represent stochastic variables and arcs represent dependencies among those variables. The arcs are annotated with probability distributions shaping the interactions between the linked variables. A probabilistic graphical model is called a Bayesian network when the graph connecting its variables is a directed acyclic graph (DAG). This graph represents conditional independence assumptions that are used to factorize the joint probability distribution of the network variables, thus making the process of learning from a large database computationally tractable. A Bayesian network induced from data can be used to investigate distant relationships between variables, as well as to make predictions and explanations, by computing the conditional probability distribution of one variable given the values of some others.
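
To make the factorization concrete, the following sketch encodes a three-variable DAG and answers a conditional query by enumeration; the network structure and all probability tables are invented for illustration.

```python
# Minimal sketch: a three-node Bayesian network A -> B, A -> C.
# The DAG factorizes the joint as P(A, B, C) = P(A) * P(B | A) * P(C | A),
# so three small tables replace one 8-entry joint table.
# All numbers below are illustrative assumptions.

p_a = {True: 0.3, False: 0.7}                     # P(A)
p_b_given_a = {True: {True: 0.9, False: 0.1},     # P(B | A)
               False: {True: 0.2, False: 0.8}}
p_c_given_a = {True: {True: 0.6, False: 0.4},     # P(C | A)
               False: {True: 0.1, False: 0.9}}

def joint(a, b, c):
    """P(A=a, B=b, C=c) via the DAG factorization."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]

# Conditional query by enumeration: P(A = True | B = True).
num = sum(joint(True, True, c) for c in (True, False))
den = sum(joint(a, True, c) for a in (True, False) for c in (True, False))
print(num / den)  # ~0.66
```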


Author(s):  
Luis Enrique Sucar

In this chapter we will cover the fundamentals of probabilistic graphical models, in particular Bayesian networks and influence diagrams, which are the basis for some of the techniques and applications described in the rest of the book. First we will give a general introduction to probabilistic graphical models, including the motivation for using these models, and a brief history and general description of the main types of models. We will also include a brief review of the basics of probability theory. The core of the chapter will be the next three sections, devoted to: (i) Bayesian networks, (ii) dynamic Bayesian networks, and (iii) influence diagrams. For each, we will introduce the models and their properties, and give some examples. We will briefly describe the main inference techniques for the three types of models. For Bayesian and dynamic Bayesian networks we will discuss learning, including structure and parameter learning, and describe the main types of approaches. At the end of the section on influence diagrams we will briefly introduce sequential decision problems as a link to the chapter on MDPs and POMDPs. We conclude the chapter with a summary and pointers to further reading for each topic.


Author(s):  
Yang Xiang

Graphical models such as Bayesian networks (BNs) (Pearl, 1988; Jensen & Nielsen, 2007) and decomposable Markov networks (DMNs) (Xiang, Wong, & Cercone, 1997) have been widely applied to probabilistic reasoning in intelligent systems. Knowledge representation using such models for a simple problem domain is illustrated in Figure 1: a virus can damage computer files, and so can a power glitch; a power glitch also causes a VCR to reset. Links, and the lack of them, convey dependence and independence relations among these variables, and the strength of each link is quantified by a probability distribution. The networks are useful for inferring whether the computer has a virus after checking the files and the VCR. This chapter considers how to discover such models from data. Discovery of graphical models (Neapolitan, 2004) by testing all alternatives is intractable, so heuristic search is commonly applied (Cooper & Herskovits, 1992; Spirtes, Glymour, & Scheines, 1993; Lam & Bacchus, 1994; Heckerman, Geiger, & Chickering, 1995; Friedman, Geiger, & Goldszmidt, 1997; Xiang, Wong, & Cercone, 1997). All heuristics make simplifying assumptions about the unknown data-generating models; these assumptions exclude certain models in order to gain efficiency. Often neither the assumptions nor the models they exclude are explicitly stated, and users of such heuristics may suffer from the exclusion without knowing it. This chapter examines the assumptions underlying common heuristics and their consequences for graphical model discovery. A decision-theoretic strategy for choosing heuristics is introduced that takes into account a full range of consequences (including efficiency of discovery, efficiency of inference using the discovered model, and the cost of inference with an incorrectly discovered model) and resolves the above issue.
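
The sketch below illustrates the kind of heuristic the chapter examines: a generic score-based greedy (hill-climbing) structure search. The tiny dataset, the penalized log-likelihood score, and the flat per-arc penalty are illustrative assumptions, not the chapter's decision-theoretic strategy.

```python
# Minimal sketch of score-based greedy structure search. It repeatedly adds
# the single arc that most improves a penalized log-likelihood score,
# subject to acyclicity, and stops at a local optimum.

import itertools
import math

# Rows assign 0/1 to (X0, X1, X2): X1 mirrors X0, X2 is independent noise.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1),
        (0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]

def log_likelihood(child, parents):
    """Maximized log-likelihood of one variable given a parent set."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in sorted(parents))
        counts.setdefault(key, [0, 0])[row[child]] += 1
    ll = 0.0
    for n0, n1 in counts.values():
        for k in (n0, n1):
            if k:
                ll += k * math.log(k / (n0 + n1))
    return ll

def is_ancestor(parents, a, x):
    """True if a is reachable from x by following parent links."""
    stack = list(parents[x])
    while stack:
        p = stack.pop()
        if p == a:
            return True
        stack.extend(parents[p])
    return False

def greedy_search(n_vars=3, penalty=1.0):
    parents = {v: set() for v in range(n_vars)}
    while True:
        best = None
        for u, v in itertools.permutations(range(n_vars), 2):
            if u in parents[v] or is_ancestor(parents, v, u):
                continue  # arc exists, or adding u -> v would create a cycle
            gain = (log_likelihood(v, parents[v] | {u})
                    - log_likelihood(v, parents[v]) - penalty)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, u, v)
        if best is None:
            return parents
        parents[best[2]].add(best[1])

print(greedy_search())  # {0: set(), 1: {0}, 2: set()}
```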


2014 ◽  
Vol 3 (22) ◽  
pp. 101 ◽  
Author(s):  
Alena Vladimirovna Suvorova ◽  
Tatiana Valentinovna Tulupyeva ◽  
Alexander Lvovich Tulupyev ◽  
Alexander Vladimirovich Sirotkin ◽  
Anton Evgen’evich Paschenko


Author(s):  
Max A. Little

Statistical machine learning and statistical DSP are built on the foundations of probability theory and random variables. Different techniques encode different dependency structures between these variables, and this structure leads to specific algorithms for inference and estimation. Many common dependency structures emerge naturally in this way; as a result, there are many common patterns of inference and estimation that suggest general algorithms for this purpose. So it becomes important to formalize these algorithms; this is the purpose of this chapter. These general algorithms can often lead to substantial computational savings over more brute-force approaches, another benefit that comes from studying the structure of these models in the abstract.
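
As one concrete instance of such savings, the sketch below computes a marginal in a binary Markov chain two ways: brute-force enumeration over all 2^n joint configurations, and the O(n) forward recursion that exploits the chain's dependency structure. The chain's probabilities are illustrative assumptions.

```python
# Brute-force enumeration over a chain of n binary variables costs O(2^n);
# the forward recursion (a simple message-passing pattern) costs O(n).
# Both must return the same marginal.

import itertools

n = 12
p_init = {0: 0.5, 1: 0.5}                        # P(X_1)
p_trans = {0: {0: 0.9, 1: 0.1},                  # P(X_t | X_{t-1})
           1: {0: 0.2, 1: 0.8}}

def brute_force_marginal():
    """P(X_n = 1) by summing over all 2^n joint configurations."""
    total = 0.0
    for xs in itertools.product((0, 1), repeat=n):
        p = p_init[xs[0]]
        for a, b in zip(xs, xs[1:]):
            p *= p_trans[a][b]
        if xs[-1] == 1:
            total += p
    return total

def forward_marginal():
    """P(X_n = 1) by the O(n) forward recursion."""
    msg = dict(p_init)
    for _ in range(n - 1):
        msg = {b: sum(msg[a] * p_trans[a][b] for a in (0, 1))
               for b in (0, 1)}
    return msg[1]

print(brute_force_marginal(), forward_marginal())  # identical, ~0.337
```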


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3112
Author(s):  
Jesus Cerquides

Probabilistic graphical models allow us to encode a large probability distribution as a composition of smaller ones. Often we are interested in incorporating into the model the idea that some of these smaller distributions are likely to be similar to one another. In this paper we provide an information-geometric approach to incorporating this information, and we see that it allows us to reinterpret some existing models. Our proposal relies on providing a formal definition of what it means for distributions to be close. We provide an example of how this definition can be put into practice for multinomial distributions, and we use the results on multinomial distributions to reinterpret two existing hierarchical models in terms of closeness distributions.
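
The paper's formal definition of closeness is not reproduced in this abstract; purely as an illustration of the kind of quantity such a definition must supply, the sketch below computes one standard information-geometric divergence between multinomial parameter vectors, with invented parameters.

```python
# Kullback-Leibler divergence between two multinomial parameter vectors,
# one standard information-geometric measure of how "close" two local
# distributions in a graphical model are. This is NOT the paper's formal
# definition; the parameter vectors are invented for illustration.

import math

def kl_divergence(p, q):
    """KL(p || q) between two multinomial parameter vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]   # one local distribution in the model
q = [0.4, 0.4, 0.2]   # a second, presumed-similar local distribution

print(kl_divergence(p, q))  # ~0.025: a small value, the two are "close"
```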


1997 ◽  
Vol 36 (04/05) ◽  
pp. 41-46
Author(s):  
A. Kjaer ◽  
W. Jensen ◽  
T. Dyrby ◽  
L. Andreasen ◽  
J. Andersen ◽  
...  

A new method for sleep-stage classification using a causal probabilistic network (CPN) as an automatic classifier has been implemented and validated. The system uses as input features derived from the primary sleep signals from the brain (EEG) and the eyes (AOG). From the EEG, features are derived containing spectral information on power in the classical spectral bands, sleep spindles, and K-complexes. From the AOG, information on rapid eye movements is derived. Features are extracted every 2 seconds. The CPN-based sleep classifier was implemented using the HUGIN system, an application tool for handling causal probabilistic networks. The results obtained using different training approaches show agreements ranging from 68.7% to 70.7% between the system and the two experts when a pooled agreement is computed over the six subjects. For comparison, the interrater agreement between the two experts, measured over the same six subjects, was 71.4%.
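
The abstract does not specify the authors' exact feature pipeline; the sketch below shows one standard way to compute the classical band-power features over a 2-second EEG epoch, with an assumed sampling rate, band edges, and Welch-based estimator.

```python
# Minimal sketch of the kind of spectral feature described above: power in
# the classical EEG bands, computed per 2-second epoch. The sampling rate
# and band edges are standard assumptions, not taken from the paper.

import numpy as np
from scipy.signal import welch

FS = 128  # assumed sampling rate in Hz
BANDS = {"delta": (0.5, 4), "theta": (4, 8),
         "alpha": (8, 13), "beta": (13, 30)}  # classical band edges in Hz

def band_powers(epoch, fs=FS):
    """Return power in each classical band for one 2-second EEG epoch."""
    freqs, psd = welch(epoch, fs=fs, nperseg=len(epoch))
    df = freqs[1] - freqs[0]
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = psd[mask].sum() * df  # rectangle-rule band power
    return powers

# Demo on synthetic data: 2 seconds of a 10 Hz (alpha) tone plus noise.
t = np.arange(2 * FS) / FS
epoch = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.randn(t.size)
print(band_powers(epoch))  # alpha power dominates
```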

