Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations

2020 ◽  
Vol 45 (1) ◽  
pp. 34-62 ◽  
Author(s):  
Kousha Etessami ◽  
Alistair Stewart ◽  
Mihalis Yannakakis
2008 ◽  
Vol 19 (03) ◽  
pp. 549-563 ◽  
Author(s):  
LAURENT DOYEN ◽  
THOMAS A. HENZINGER ◽  
JEAN-FRANÇOIS RASKIN

We consider the equivalence problem for labeled Markov chains (LMCs), where each state is labeled with an observation. Two LMCs are equivalent if every finite sequence of observations has the same probability of occurrence in the two LMCs. We show that equivalence can be decided in polynomial time, using a reduction to the equivalence problem for probabilistic automata, which is known to be solvable in polynomial time. We provide an alternative algorithm to solve the equivalence problem, which is based on a new definition of bisimulation for probabilistic automata. We also extend the technique to decide the equivalence of weighted probabilistic automata. Then, we consider the equivalence problem for labeled Markov decision processes (LMDPs), which asks, given two LMDPs, whether for every scheduler (i.e., a way of resolving the nondeterministic decisions) for either of the processes there exists a scheduler for the other process such that the resulting LMCs are equivalent. The decidability of this problem remains open. We show that the schedulers can be restricted to be observation-based, but may require infinite memory.
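A polynomial-time equivalence check of the kind described here can be carried out with linear algebra. The sketch below is a Tzeng-style basis-exploration algorithm applied directly to the two chains (not the paper's reduction or its bisimulation-based alternative), under the assumed convention that each state emits its own label; the function name and the example instance are illustrative.

```python
import numpy as np

def lmc_equivalent(M1, labels1, pi1, M2, labels2, pi2, obs, tol=1e-9):
    """Decide equivalence of two labeled Markov chains (sketch).

    Assumed convention: state s emits labels[s], and for a word w = o1..on
    Pr(w) = pi @ A_{o1} @ ... @ A_{on} @ 1  with  A_o = D_o @ M,
    where D_o is the diagonal 0/1 matrix selecting states labeled o.
    The chains are equivalent iff u @ A_w @ 1 = 0 for every word w,
    where u = (pi1, -pi2) on the disjoint union of the state spaces.
    """
    n1, n2 = len(pi1), len(pi2)
    A = {}
    for o in obs:
        D1 = np.diag([1.0 if l == o else 0.0 for l in labels1])
        D2 = np.diag([1.0 if l == o else 0.0 for l in labels2])
        A[o] = np.block([[D1 @ M1, np.zeros((n1, n2))],
                         [np.zeros((n2, n1)), D2 @ M2]])
    u = np.concatenate([np.asarray(pi1, float), -np.asarray(pi2, float)])
    eta = np.ones(n1 + n2)                 # final vector (all ones)

    basis, worklist = [], [u]              # orthonormal basis of span{u @ A_w}
    while worklist:
        v = worklist.pop()
        for b in basis:                    # Gram-Schmidt against current basis
            v = v - (v @ b) * b
        norm = np.linalg.norm(v)
        if norm <= tol:
            continue                       # nothing new in this direction
        v = v / norm
        basis.append(v)                    # at most n1 + n2 basis vectors
        for o in obs:
            worklist.append(v @ A[o])      # explore successors of the new vector
    # Equivalent iff the final vector is orthogonal to the whole reachable span.
    return all(abs(b @ eta) <= tol for b in basis)

# Example: a two-state chain and a one-state chain that both emit 'a' forever.
M1 = np.array([[0.5, 0.5], [0.5, 0.5]]); labels1 = ['a', 'a']; pi1 = [1.0, 0.0]
M2 = np.array([[1.0]]);                  labels2 = ['a'];      pi2 = [1.0]
print(lmc_equivalent(M1, labels1, pi1, M2, labels2, pi2, obs={'a'}))  # True
```

Since the basis can contain at most |S1| + |S2| vectors, the exploration terminates after polynomially many linear-algebra steps.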


2011 ◽  
Vol 48 (04) ◽  
pp. 954-967 ◽  
Author(s):  
Chin Hon Tan ◽  
Joseph C. Hartman

Sequential decision problems can often be modeled as Markov decision processes. Classical solution approaches assume that the parameters of the model are known. However, model parameters are usually estimated and uncertain in practice. As a result, managers are often interested in how estimation errors affect the optimal solution. In this paper we illustrate how sensitivity analysis can be performed directly for a Markov decision process with uncertain reward parameters using the Bellman equations. In particular, we consider problems involving (i) a single stationary parameter, (ii) multiple stationary parameters, and (iii) multiple nonstationary parameters. We illustrate the applicability of this work through a capacitated stochastic lot-sizing problem.
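The paper performs this sensitivity analysis analytically through the Bellman equations; the sketch below is only a brute-force numerical illustration of the same question on a made-up discounted MDP: solve the Bellman optimality equations by value iteration, sweep a single stationary reward parameter theta, and report where the optimal policy changes. The instance, parameter, and grid are hypothetical.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Solve a discounted MDP via the Bellman optimality equations.

    P[a] is the |S| x |S| transition matrix of action a,
    R[a] is the |S| vector of expected one-step rewards of action a.
    Returns the optimal value vector and a greedy (optimal) policy.
    """
    n_states = P[0].shape[0]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmax(axis=0)

# Hypothetical two-state, two-action MDP with one uncertain stationary
# reward parameter theta (the reward of action 1 in state 0).
P = [np.array([[0.8, 0.2], [0.1, 0.9]]),   # action 0
     np.array([[0.3, 0.7], [0.6, 0.4]])]   # action 1
def rewards(theta):
    return [np.array([1.0, 0.0]),          # action 0
            np.array([theta, 0.5])]        # action 1

# Crude numerical sensitivity sweep: locate where the optimal policy changes.
prev = None
for theta in np.linspace(0.0, 2.0, 201):
    _, policy = value_iteration(P, rewards(theta))
    if prev is not None and not np.array_equal(policy, prev):
        print(f"optimal policy changes near theta = {theta:.2f}: {prev} -> {policy}")
    prev = policy
```

The paper's approach derives such stability ranges directly from the Bellman equations rather than by sweeping the parameter.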


2021 ◽  
Vol 17, Issue 4 ◽  
Author(s):  
Jeremy Sproston

Clock-dependent probabilistic timed automata extend classical timed automata with discrete probabilistic choice, where the probabilities are allowed to depend on the exact values of the clocks. Previous work has shown that the quantitative reachability problem for clock-dependent probabilistic timed automata with at least three clocks is undecidable. In this paper, we consider the subclass of clock-dependent probabilistic timed automata that have one clock, that have clock dependencies described by affine functions, and that satisfy an initialisation condition requiring that, at some point between taking edges with non-trivial clock dependencies, the clock must have an integer value. We present an approach for solving in polynomial time quantitative and qualitative reachability problems of such one-clock initialised clock-dependent probabilistic timed automata. Our results are obtained by a transformation to interval Markov decision processes.
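The results above go through a transformation to interval Markov decision processes. The sketch below does not reproduce that transformation; it only illustrates the standard value-iteration step for maximal reachability in an interval MDP, where the remaining probability mass is pushed greedily towards the most valuable successors within the interval bounds. Names and the example instance are assumptions made for illustration.

```python
import numpy as np

def best_interval_distribution(lo, hi, values, maximize=True):
    """Pick a distribution within the intervals [lo[i], hi[i]] that maximizes
    (or minimizes) the expectation of `values`.  Assumes feasibility,
    i.e. lo.sum() <= 1 <= hi.sum().  Greedy: give every successor its lower
    bound, then push the remaining mass to the best successors first.
    """
    p = lo.copy()
    slack = 1.0 - lo.sum()                      # mass still to distribute
    order = np.argsort(-values if maximize else values)
    for i in order:
        add = min(hi[i] - lo[i], slack)
        p[i] += add
        slack -= add
        if slack <= 1e-12:
            break
    return p

def interval_mdp_max_reach(LO, HI, target, n_iter=1000):
    """Maximal reachability probability in an interval MDP, with the interval
    uncertainty resolved in favour of the controller.
    LO[s][a] and HI[s][a] are vectors of lower/upper transition bounds.
    """
    n = len(LO)
    V = np.array([1.0 if s in target else 0.0 for s in range(n)])
    for _ in range(n_iter):
        V_new = V.copy()
        for s in range(n):
            if s in target:
                continue
            best = 0.0
            for a in range(len(LO[s])):
                p = best_interval_distribution(LO[s][a], HI[s][a], V)
                best = max(best, p @ V)
            V_new[s] = best
        V = V_new
    return V

# Tiny illustrative instance: from state 0 a single action leads to the
# target state 1 and the sink state 2 with interval-constrained probabilities.
LO = [[np.array([0.0, 0.3, 0.2])],   # state 0
      [np.array([0.0, 1.0, 0.0])],   # state 1 (target, self-loop)
      [np.array([0.0, 0.0, 1.0])]]   # state 2 (sink, self-loop)
HI = [[np.array([0.0, 0.7, 0.7])],
      [np.array([0.0, 1.0, 0.0])],
      [np.array([0.0, 0.0, 1.0])]]
print(interval_mdp_max_reach(LO, HI, target={1}))   # state 0 reaches with 0.7
```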


1983 ◽  
Vol 20 (04) ◽  
pp. 835-842
Author(s):  
David Assaf

The paper presents sufficient conditions for certain functions to be convex. Functions of this type often appear in Markov decision processes, where their maximum is the solution of the problem. Since a convex function takes its maximum at an extreme point, the conditions may greatly simplify a problem. In some cases a full solution may be obtained after the reduction is made. Some illustrative examples are discussed.
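The reduction rests on the elementary fact that a convex function on a compact convex set attains its maximum at an extreme point, so maximization can be restricted to the extreme points of the feasible set. A minimal numerical illustration (not the paper's sufficient conditions) for a box-shaped feasible set:

```python
import numpy as np
from itertools import product

def maximize_convex_over_box(f, lows, highs):
    """Maximize a convex function over a box by checking only its corners.

    A convex function on a compact convex set attains its maximum at an
    extreme point, so for a box it suffices to evaluate the 2^d vertices.
    """
    best_x, best_val = None, -np.inf
    for corner in product(*zip(lows, highs)):
        val = f(np.array(corner))
        if val > best_val:
            best_x, best_val = np.array(corner), val
    return best_x, best_val

# Example: f(x) = ||x - c||^2 is convex, so its maximum over the box
# [0,1] x [0,2] is attained at one of the four corners.
c = np.array([0.2, 0.5])
x_star, v_star = maximize_convex_over_box(lambda x: np.sum((x - c) ** 2),
                                          lows=[0.0, 0.0], highs=[1.0, 2.0])
print(x_star, v_star)   # -> [1. 2.] with value 0.8**2 + 1.5**2 = 2.89
```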

