Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations

2020 ◽  
Vol 45 (1) ◽  
pp. 34-62 ◽  
Author(s):  
Kousha Etessami ◽  
Alistair Stewart ◽  
Mihalis Yannakakis
2008 ◽  
Vol 19 (03) ◽  
pp. 549-563 ◽  
Author(s):  
LAURENT DOYEN ◽  
THOMAS A. HENZINGER ◽  
JEAN-FRANÇOIS RASKIN

We consider the equivalence problem for labeled Markov chains (LMCs), where each state is labeled with an observation. Two LMCs are equivalent if every finite sequence of observations has the same probability of occurrence in the two LMCs. We show that equivalence can be decided in polynomial time, using a reduction to the equivalence problem for probabilistic automata, which is known to be solvable in polynomial time. We provide an alternative algorithm to solve the equivalence problem, which is based on a new definition of bisimulation for probabilistic automata. We also extend the technique to decide the equivalence of weighted probabilistic automata. Then, we consider the equivalence problem for labeled Markov decision processes (LMDPs), which asks, given two LMDPs, whether for every scheduler (i.e., a way of resolving the nondeterministic decisions) for either of the processes there exists a scheduler for the other process such that the resulting LMCs are equivalent. The decidability of this problem remains open. We show that the schedulers can be restricted to be observation-based, but may require infinite memory.
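A polynomial-time equivalence check of the kind described here can be carried out with linear algebra. The sketch below is a Tzeng-style basis-exploration algorithm applied directly to the two chains (not the paper's reduction or its bisimulation-based alternative), under the assumed convention that each state emits its own label; the function name and the example instance are illustrative.

```python
import numpy as np

def lmc_equivalent(M1, labels1, pi1, M2, labels2, pi2, obs, tol=1e-9):
    """Decide equivalence of two labeled Markov chains (sketch).

    Assumed convention: state s emits labels[s], and for a word w = o1..on
    Pr(w) = pi @ A_{o1} @ ... @ A_{on} @ 1  with  A_o = D_o @ M,
    where D_o is the diagonal 0/1 matrix selecting states labeled o.
    The chains are equivalent iff u @ A_w @ 1 = 0 for every word w,
    where u = (pi1, -pi2) on the disjoint union of the state spaces.
    """
    n1, n2 = len(pi1), len(pi2)
    A = {}
    for o in obs:
        D1 = np.diag([1.0 if l == o else 0.0 for l in labels1])
        D2 = np.diag([1.0 if l == o else 0.0 for l in labels2])
        A[o] = np.block([[D1 @ M1, np.zeros((n1, n2))],
                         [np.zeros((n2, n1)), D2 @ M2]])
    u = np.concatenate([np.asarray(pi1, float), -np.asarray(pi2, float)])
    eta = np.ones(n1 + n2)                 # final vector (all ones)

    basis, worklist = [], [u]              # orthonormal basis of span{u @ A_w}
    while worklist:
        v = worklist.pop()
        for b in basis:                    # Gram-Schmidt against current basis
            v = v - (v @ b) * b
        norm = np.linalg.norm(v)
        if norm <= tol:
            continue                       # nothing new in this direction
        v = v / norm
        basis.append(v)                    # at most n1 + n2 basis vectors
        for o in obs:
            worklist.append(v @ A[o])      # explore successors of the new vector
    # Equivalent iff the final vector is orthogonal to the whole reachable span.
    return all(abs(b @ eta) <= tol for b in basis)

# Example: a two-state chain and a one-state chain that both emit 'a' forever.
M1 = np.array([[0.5, 0.5], [0.5, 0.5]]); labels1 = ['a', 'a']; pi1 = [1.0, 0.0]
M2 = np.array([[1.0]]);                  labels2 = ['a'];      pi2 = [1.0]
print(lmc_equivalent(M1, labels1, pi1, M2, labels2, pi2, obs={'a'}))  # True
```

Since the basis can contain at most |S1| + |S2| vectors, the exploration terminates after polynomially many linear-algebra steps.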


2011 ◽  
Vol 48 (04) ◽  
pp. 954-967 ◽  
Author(s):  
Chin Hon Tan ◽  
Joseph C. Hartman

Sequential decision problems can often be modeled as Markov decision processes. Classical solution approaches assume that the parameters of the model are known. However, model parameters are usually estimated and uncertain in practice. As a result, managers are often interested in how estimation errors affect the optimal solution. In this paper we illustrate how sensitivity analysis can be performed directly for a Markov decision process with uncertain reward parameters using the Bellman equations. In particular, we consider problems involving (i) a single stationary parameter, (ii) multiple stationary parameters, and (iii) multiple nonstationary parameters. We illustrate the applicability of this work through a capacitated stochastic lot-sizing problem.
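The paper performs this sensitivity analysis analytically through the Bellman equations; the sketch below is only a brute-force numerical illustration of the same question on a made-up discounted MDP: solve the Bellman optimality equations by value iteration, sweep a single stationary reward parameter theta, and report where the optimal policy changes. The instance, parameter, and grid are hypothetical.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Solve a discounted MDP via the Bellman optimality equations.

    P[a] is the |S| x |S| transition matrix of action a,
    R[a] is the |S| vector of expected one-step rewards of action a.
    Returns the optimal value vector and a greedy (optimal) policy.
    """
    n_states = P[0].shape[0]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmax(axis=0)

# Hypothetical two-state, two-action MDP with one uncertain stationary
# reward parameter theta (the reward of action 1 in state 0).
P = [np.array([[0.8, 0.2], [0.1, 0.9]]),   # action 0
     np.array([[0.3, 0.7], [0.6, 0.4]])]   # action 1
def rewards(theta):
    return [np.array([1.0, 0.0]),          # action 0
            np.array([theta, 0.5])]        # action 1

# Crude numerical sensitivity sweep: locate where the optimal policy changes.
prev = None
for theta in np.linspace(0.0, 2.0, 201):
    _, policy = value_iteration(P, rewards(theta))
    if prev is not None and not np.array_equal(policy, prev):
        print(f"optimal policy changes near theta = {theta:.2f}: {prev} -> {policy}")
    prev = policy
```

The paper's approach derives such stability ranges directly from the Bellman equations rather than by sweeping the parameter.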


2021 ◽  
Vol 17, Issue 4 ◽  
Author(s):  
Jeremy Sproston

Clock-dependent probabilistic timed automata extend classical timed automata with discrete probabilistic choice, where the probabilities are allowed to depend on the exact values of the clocks. Previous work has shown that the quantitative reachability problem for clock-dependent probabilistic timed automata with at least three clocks is undecidable. In this paper, we consider the subclass of clock-dependent probabilistic timed automata that have one clock, that have clock dependencies described by affine functions, and that satisfy an initialisation condition requiring that, at some point between taking edges with non-trivial clock dependencies, the clock must have an integer value. We present an approach for solving in polynomial time quantitative and qualitative reachability problems of such one-clock initialised clock-dependent probabilistic timed automata. Our results are obtained by a transformation to interval Markov decision processes.
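The results above go through a transformation to interval Markov decision processes. The sketch below does not reproduce that transformation; it only illustrates the standard value-iteration step for maximal reachability in an interval MDP, where the remaining probability mass is pushed greedily towards the most valuable successors within the interval bounds. Names and the example instance are assumptions made for illustration.

```python
import numpy as np

def best_interval_distribution(lo, hi, values, maximize=True):
    """Pick a distribution within the intervals [lo[i], hi[i]] that maximizes
    (or minimizes) the expectation of `values`.  Assumes feasibility,
    i.e. lo.sum() <= 1 <= hi.sum().  Greedy: give every successor its lower
    bound, then push the remaining mass to the best successors first.
    """
    p = lo.copy()
    slack = 1.0 - lo.sum()                      # mass still to distribute
    order = np.argsort(-values if maximize else values)
    for i in order:
        add = min(hi[i] - lo[i], slack)
        p[i] += add
        slack -= add
        if slack <= 1e-12:
            break
    return p

def interval_mdp_max_reach(LO, HI, target, n_iter=1000):
    """Maximal reachability probability in an interval MDP, with the interval
    uncertainty resolved in favour of the controller.
    LO[s][a] and HI[s][a] are vectors of lower/upper transition bounds.
    """
    n = len(LO)
    V = np.array([1.0 if s in target else 0.0 for s in range(n)])
    for _ in range(n_iter):
        V_new = V.copy()
        for s in range(n):
            if s in target:
                continue
            best = 0.0
            for a in range(len(LO[s])):
                p = best_interval_distribution(LO[s][a], HI[s][a], V)
                best = max(best, p @ V)
            V_new[s] = best
        V = V_new
    return V

# Tiny illustrative instance: from state 0 a single action leads to the
# target state 1 and the sink state 2 with interval-constrained probabilities.
LO = [[np.array([0.0, 0.3, 0.2])],   # state 0
      [np.array([0.0, 1.0, 0.0])],   # state 1 (target, self-loop)
      [np.array([0.0, 0.0, 1.0])]]   # state 2 (sink, self-loop)
HI = [[np.array([0.0, 0.7, 0.7])],
      [np.array([0.0, 1.0, 0.0])],
      [np.array([0.0, 0.0, 1.0])]]
print(interval_mdp_max_reach(LO, HI, target={1}))   # state 0 reaches with 0.7
```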


1983 ◽  
Vol 20 (04) ◽  
pp. 835-842
Author(s):  
David Assaf

The paper presents sufficient conditions for certain functions to be convex. Functions of this type often appear in Markov decision processes, where their maximum is the solution of the problem. Since a convex function takes its maximum at an extreme point, the conditions may greatly simplify a problem. In some cases a full solution may be obtained after the reduction is made. Some illustrative examples are discussed.
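The reduction rests on the elementary fact that a convex function on a compact convex set attains its maximum at an extreme point, so maximization can be restricted to the extreme points of the feasible set. A minimal numerical illustration (not the paper's sufficient conditions) for a box-shaped feasible set:

```python
import numpy as np
from itertools import product

def maximize_convex_over_box(f, lows, highs):
    """Maximize a convex function over a box by checking only its corners.

    A convex function on a compact convex set attains its maximum at an
    extreme point, so for a box it suffices to evaluate the 2^d vertices.
    """
    best_x, best_val = None, -np.inf
    for corner in product(*zip(lows, highs)):
        val = f(np.array(corner))
        if val > best_val:
            best_x, best_val = np.array(corner), val
    return best_x, best_val

# Example: f(x) = ||x - c||^2 is convex, so its maximum over the box
# [0,1] x [0,2] is attained at one of the four corners.
c = np.array([0.2, 0.5])
x_star, v_star = maximize_convex_over_box(lambda x: np.sum((x - c) ** 2),
                                          lows=[0.0, 0.0], highs=[1.0, 2.0])
print(x_star, v_star)   # -> [1. 2.] with value 0.8**2 + 1.5**2 = 2.89
```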

