An Algebraic Graphical Model for Decision with Uncertainties, Feasibilities, and Utilities

2007, Vol. 29, pp. 421-489
Author(s): C. Pralet, G. Verfaillie, T. Schiex

Numerous formalisms and dedicated algorithms have been designed in recent decades to model and solve decision-making problems. Some formalisms, such as constraint networks, can express "simple" decision problems, while others are designed to take into account uncertainties, infeasible decisions, and utilities. Even within a single formalism, several variants are often proposed to model different types of uncertainty (probability, possibility, ...) or utility (additive or not). In this article, we introduce an algebraic graphical model that encompasses a large number of such formalisms: (1) we first adapt previous structures from Friedman, Chu, and Halpern for representing uncertainty, utility, and expected utility in order to deal with generic forms of sequential decision making; (2) on these structures, we then introduce composite graphical models that express information via variables linked by "local" functions, thanks to conditional independence; (3) on these graphical models, we finally define a simple class of queries which can represent various scenarios in terms of observability and controllability. A natural decision-tree semantics for such queries is complemented by an equivalent operational semantics, which induces generic algorithms. The proposed framework, called the Plausibility-Feasibility-Utility (PFU) framework, not only provides a better understanding of the links between existing formalisms, but also covers as-yet-unpublished frameworks (such as possibilistic influence diagrams) and unifies formalisms such as quantified Boolean formulas and influence diagrams. Our generic backtrack and variable elimination algorithms are a first step towards unified algorithms.
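The variable elimination idea mentioned in the abstract can be sketched informally: local functions over small scopes are combined with an algebraic combination operator, and variables are eliminated one at a time with an elimination operator (for instance, sum for environment variables and max for decision variables in the expected-utility case). The Python sketch below is only an illustration of that idea under simplified assumptions (scalar-valued factors, a fixed elimination order, invented toy tables); it is not the PFU algorithm from the paper.

```python
from itertools import product

def eliminate(factors, var, domains, combine, reduce_op):
    """Combine all factors mentioning `var`, then eliminate `var` with `reduce_op`."""
    touching = [f for f in factors if var in f["scope"]]
    rest = [f for f in factors if var not in f["scope"]]
    scope = sorted({v for f in touching for v in f["scope"]} - {var})

    def value(assignment):
        vals = []
        for x in domains[var]:
            full = dict(assignment, **{var: x})
            acc = None
            for f in touching:
                v = f["table"][tuple(full[s] for s in f["scope"])]
                acc = v if acc is None else combine(acc, v)
            vals.append(acc)
        out = vals[0]
        for v in vals[1:]:
            out = reduce_op(out, v)
        return out

    table = {pt: value(dict(zip(scope, pt)))
             for pt in product(*(domains[s] for s in scope))}
    return rest + [{"scope": scope, "table": table}]

# Toy query: plausibilities combined by *, environment variable e eliminated
# by +, decision variable d eliminated by max (all values are invented).
domains = {"d": [0, 1], "e": [0, 1]}
factors = [
    {"scope": ["d", "e"], "table": {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 4.0, (1, 1): 0.5}},
    {"scope": ["e"],      "table": {(0,): 0.6, (1,): 0.4}},
]
factors = eliminate(factors, "e", domains, combine=lambda a, b: a * b, reduce_op=lambda a, b: a + b)
factors = eliminate(factors, "d", domains, combine=lambda a, b: a * b, reduce_op=max)
print(factors[0]["table"][()])   # value of the max-expected-utility-style query (2.6)
```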

Author(s): Layton Hayes, Prashant Doshi, Swaraj Pawar, Hari Teja Tatavarti

The sum-product network (SPN) has been extended to model sequence data with the recurrent SPN (RSPN), and to decision-making problems with sum-product-max networks (SPMNs). In this paper, we build on the concepts introduced by these extensions and present state-based recurrent SPMNs (S-RSPMNs) as a generalization of SPMNs to sequential decision-making problems in which the state may not be perfectly observed. As with recurrent SPNs, S-RSPMNs utilize a repeatable template network to model sequences of arbitrary length. We present an algorithm for learning compact template structures by identifying unique belief states and the transitions between them through a state-matching process that utilizes augmented data. To our knowledge, this is the first data-driven approach that learns graphical models for planning under partial observability that can be solved efficiently. S-RSPMNs retain the linear solution complexity of SPMNs, and we demonstrate significant improvements in the compactness of the representation and in the run time of structure learning and inference in sequential domains.
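The linear solution complexity mentioned above comes from a single bottom-up pass over the network: sum nodes take a weighted average of their children, product nodes multiply them, and max nodes select the best decision branch. The Python sketch below is a much-simplified, hypothetical illustration of that pass on an invented toy network; it omits evidence, information sets, and the recurrent template machinery of S-RSPMNs.

```python
def evaluate(node):
    """Return the (expected-utility-style) value of a node in a tiny SPMN, bottom-up."""
    kind = node["type"]
    if kind == "leaf":                      # utility / likelihood leaf
        return node["value"]
    child_vals = [evaluate(c) for c in node["children"]]
    if kind == "sum":                       # chance node: weighted average of children
        return sum(w * v for w, v in zip(node["weights"], child_vals))
    if kind == "product":                   # independent factors: multiply children
        out = 1.0
        for v in child_vals:
            out *= v
        return out
    if kind == "max":                       # decision node: pick the best branch
        return max(child_vals)
    raise ValueError(kind)

# Invented two-action toy network, for illustration only.
toy_spmn = {
    "type": "max",
    "children": [
        {"type": "sum", "weights": [0.7, 0.3],
         "children": [{"type": "leaf", "value": 5.0},
                      {"type": "leaf", "value": 1.0}]},
        {"type": "sum", "weights": [0.2, 0.8],
         "children": [{"type": "leaf", "value": 10.0},
                      {"type": "leaf", "value": 0.0}]},
    ],
}
print(evaluate(toy_spmn))                   # 3.8 from the first branch vs. 2.0 from the second
```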


Author(s): Yinghui Pan, Jing Tang, Biyang Ma, Yifeng Zeng, Zhong Ming

With the availability of a significant amount of data, data-driven decision making becomes an alternative way of solving complex multiagent decision problems. Instead of using domain knowledge to explicitly build decision models, the data-driven approach learns decisions (probably optimal ones) from the available data. This removes the knowledge bottleneck of traditional knowledge-driven decision making, which requires strong support from domain experts. In this paper, we study data-driven decision making in the context of interactive dynamic influence diagrams (I-DIDs), a general framework for multiagent sequential decision making under uncertainty. We propose a data-driven framework for solving I-DID models and focus on learning the behavior of other agents in the problem domain. The challenge lies in learning, from limited data, the complete policy trees that will be embedded in the I-DID model. We propose two new methods for developing complete policy trees for the other agents in the I-DID. The first method uses a simple clustering process, while the second employs sophisticated statistical checks. We analyze the proposed algorithms theoretically and evaluate them empirically on two problem domains.
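As a rough illustration of the first, clustering-style method, the sketch below groups another agent's observed behaviour by action-observation history and takes the majority action at each node to form a policy tree. Everything here (the function, the traces, the domain names) is hypothetical and far simpler than the clustering and statistical checks used in the paper.

```python
from collections import Counter

def build_policy_tree(traces, depth=0, horizon=2):
    """traces: list of episodes, each a list of (action, observation) pairs.
    Returns a nested dict {'action': a, 'children': {observation: subtree}}."""
    live = [ep for ep in traces if len(ep) > depth]
    if not live or depth >= horizon:
        return None
    # Majority action at this history (a crude stand-in for clustering behaviour).
    action = Counter(ep[depth][0] for ep in live).most_common(1)[0][0]
    children = {}
    for obs in {ep[depth][1] for ep in live}:
        sub = [ep for ep in live if ep[depth] == (action, obs)]
        child = build_policy_tree(sub, depth + 1, horizon)
        if child:
            children[obs] = child
    return {"action": action, "children": children}

# Invented traces of another agent's behaviour in a two-step toy domain.
traces = [
    [("listen", "growl-left"), ("open-right", "reward")],
    [("listen", "growl-left"), ("open-right", "reward")],
    [("listen", "growl-right"), ("open-left", "reward")],
]
print(build_policy_tree(traces, horizon=2))
```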


Author(s): Luis Enrique Sucar

In this chapter we will cover the fundamentals of probabilistic graphical models, in particular Bayesian networks and influence diagrams, which are the basis for some of the techniques and applications described in the rest of the book. First we will give a general introduction to probabilistic graphical models, including the motivation for using these models, a brief history, and a general description of the main types of models. We will also include a brief review of the basics of probability theory. The core of the chapter will be the next three sections, devoted to (i) Bayesian networks, (ii) dynamic Bayesian networks, and (iii) influence diagrams. For each, we will introduce the models and their properties and give some examples. We will briefly describe the main inference techniques for the three types of models. For Bayesian and dynamic Bayesian networks we will discuss learning, including structure and parameter learning, and describe the main types of approaches. At the end of the section on influence diagrams we will briefly introduce sequential decision problems as a link to the chapter on MDPs and POMDPs. We conclude the chapter with a summary and pointers to further reading on each topic.
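To make the first of these models concrete, the sketch below hand-builds a three-variable Bayesian network (the familiar rain/sprinkler/wet-grass example) and answers a query by brute-force enumeration; the conditional probability values are invented, and the inference algorithms treated in the chapter are far more efficient than this enumeration.

```python
from itertools import product

# Structure: Rain -> Sprinkler, Rain -> WetGrass, Sprinkler -> WetGrass.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},    # P(Sprinkler | Rain)
               False: {True: 0.4, False: 0.6}}
P_wet = {(True, True): 0.99, (True, False): 0.8,   # P(WetGrass=True | Rain, Sprinkler)
         (False, True): 0.9, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) as a product of the local CPTs (chain rule)."""
    p_wet = P_wet[(rain, sprinkler)]
    return P_rain[rain] * P_sprinkler[rain][sprinkler] * (p_wet if wet else 1 - p_wet)

def posterior_rain_given_wet():
    """P(Rain=True | WetGrass=True), summing out Sprinkler by enumeration."""
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return num / den

print(round(posterior_rain_given_wet(), 3))        # about 0.36 with these invented CPTs
```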


Author(s): Ming-Sheng Ying, Yuan Feng, Sheng-Gang Ying

The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of MDPs, namely quantum MDPs (qMDPs), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide useful mathematical tools for reinforcement learning techniques applied to the quantum world.
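For orientation, the sketch below shows finite-horizon dynamic programming (backward induction) on a small classical MDP, the construction that the qMDP algorithms generalise; the quantum version works with density operators and super-operators instead, which is not attempted here, and the toy transition and reward tables are invented.

```python
def backward_induction(states, actions, P, R, horizon):
    """P[s][a] -> {s': prob}, R[s][a] -> immediate reward.
    Returns the value function V[t][s] and policy pi[t][s] for t = 0..horizon-1."""
    V = [{s: 0.0 for s in states} for _ in range(horizon + 1)]   # V[horizon] = 0
    pi = [{} for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):          # work backwards from the last stage
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                q = R[s][a] + sum(p * V[t + 1][s2] for s2, p in P[s][a].items())
                if q > best_q:
                    best_a, best_q = a, q
            V[t][s], pi[t][s] = best_q, best_a
    return V, pi

# Invented two-state, two-action MDP.
states, actions = ["s0", "s1"], ["stay", "go"]
P = {"s0": {"stay": {"s0": 1.0}, "go": {"s1": 0.9, "s0": 0.1}},
     "s1": {"stay": {"s1": 1.0}, "go": {"s0": 1.0}}}
R = {"s0": {"stay": 0.0, "go": -1.0},
     "s1": {"stay": 2.0, "go": 0.0}}
V, pi = backward_induction(states, actions, P, R, horizon=3)
print(V[0], pi[0])    # stage-0 values and actions, e.g. 'go' from s0, 'stay' in s1
```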

