Markov Decision Processes with Discounted Cost: The action elimination procedures

Author(s):  
Abdellatif Semmouri ◽  
Mostafa Jourhmane
2010 ◽  
Vol 42 (4) ◽  
pp. 953-985 ◽  
Author(s):  
Xianping Guo ◽  
Liuer Ye

This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates, which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new minimum nonnegative solution method for the average cost, we prove the existence of an average cost optimal stationary policy under reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Aside from the requirement that the costs be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.
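The paper's setting is continuous-time on Polish spaces, but the value iteration technique it invokes can be illustrated in the simplest possible case. The following is a minimal sketch, assuming a discrete-time, finite-state, finite-action MDP with bounded costs (all names and the toy dimensions are illustrative assumptions, not the authors' construction):

```python
import numpy as np

def value_iteration(costs, P, beta=0.9, tol=1e-8, max_iter=10_000):
    """Discounted-cost value iteration on a finite MDP.

    costs: (S, A) array of one-step costs c(s, a)
    P:     (A, S, S) array, P[a, s, s'] = transition probability
    beta:  discount factor in (0, 1)
    Returns (V, policy): the optimal value and a greedy stationary policy.
    """
    S, A = costs.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        # Q[s, a] = c(s, a) + beta * E[V(next state) | s, a]
        Q = costs + beta * (P @ V).T
        V_new = Q.min(axis=1)          # Bellman optimality update (cost: min)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmin(axis=1)          # greedy (deterministic stationary) policy
    return V, policy
```

Since the Bellman operator is a beta-contraction in the sup norm, the iterates converge geometrically, and the greedy policy with respect to the fixed point is discounted-cost optimal.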


1998 ◽  
Vol 12 (2) ◽  
pp. 177-187 ◽  
Author(s):  
Kazuyoshi Wakuta

We consider a discounted cost Markov decision process with a constraint. Relating this to a vector-valued Markov decision process, we prove that a constrained optimal randomized semistationary policy exists whenever at least one policy satisfies the constraint. Moreover, we present an algorithm that either finds the constrained optimal randomized semistationary policy or determines that no policy satisfies the given constraint.
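Wakuta's construction goes through a vector-valued MDP and semistationary policies; a common alternative sketch for the same problem class (not the paper's algorithm) is the occupation-measure linear program for a finite constrained discounted MDP, in which LP infeasibility certifies that no policy meets the constraint. All names and the toy sizes below are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def constrained_mdp_lp(costs, d, P, alpha, beta, bound):
    """Constrained discounted MDP via the occupation-measure LP.

    costs, d: (S, A) primary and constraint one-step costs
    P:        (A, S, S) transition kernels; alpha: initial distribution
    Returns (policy, optimal cost), or None if no policy meets the constraint.
    """
    S, A = costs.shape
    n = S * A                           # variables x[s, a], flattened row-major
    # Flow balance: sum_a x[s,a] - beta * sum_{s',a} P(s|s',a) x[s',a] = alpha(s)
    A_eq = np.zeros((S, n))
    for s in range(S):
        for sp in range(S):
            for a in range(A):
                A_eq[s, sp * A + a] -= beta * P[a, sp, s]
        for a in range(A):
            A_eq[s, s * A + a] += 1.0
    # Inequality: expected discounted constraint cost must not exceed `bound`
    res = linprog(costs.ravel(), A_ub=d.ravel()[None, :], b_ub=[bound],
                  A_eq=A_eq, b_eq=alpha, bounds=[(0, None)] * n)
    if not res.success:
        return None                     # no policy satisfies the constraint
    x = res.x.reshape(S, A)
    # Randomized stationary policy: pi(a|s) proportional to x[s, a]
    denom = np.maximum(x.sum(axis=1, keepdims=True), 1e-12)
    return x / denom, res.fun
```

The optimal occupation measure may need to randomize in at most one state per active constraint, which is why randomized (rather than deterministic) policies appear in the constrained setting.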


2011 ◽  
Vol 10 (06) ◽  
pp. 1175-1197 ◽  
Author(s):  
JOHN GOULIONIS ◽  
D. STENGOS

This paper treats the infinite horizon discounted cost control problem for partially observable Markov decision processes. Sondik studied the class of finitely transient policies and showed that their value functions over an infinite time horizon are piecewise linear (p.w.l.) and can be computed exactly by solving a system of linear equations. However, the condition of finite transience is stronger than is needed to ensure p.w.l. value functions. In this paper, we alternatively introduce the class of periodic policies, whose value functions also turn out to be p.w.l. Moreover, we examine a condition more general than finite transience and periodicity that still ensures p.w.l. value functions. We implement these ideas in a replacement problem under Markovian deterioration, investigate periodic policies, and give numerical examples.
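Both the finitely transient policies of Sondik and the periodic policies introduced here yield p.w.l. value functions over the belief simplex. As a minimal illustration (assumed names, not the authors' algorithm), such a value function can be stored as a finite set of alpha-vectors, evaluated by a minimum over inner products (cost criterion), and kept small by pruning pointwise-dominated vectors:

```python
import numpy as np

def pwl_value(alphas, belief):
    # V(b) = min_k <alpha_k, b>: piecewise linear (and concave) in the belief b
    return min(float(a @ belief) for a in alphas)

def prune_dominated(alphas):
    # Drop any vector pointwise dominated by another: it is never the
    # minimizer at any belief, so removing it leaves V(b) unchanged.
    kept = []
    for i, a in enumerate(alphas):
        dominated = any(j != i and np.all(b <= a) and np.any(b < a)
                        for j, b in enumerate(alphas))
        if not dominated:
            kept.append(a)
    return kept
```

Exact value iteration for POMDPs repeatedly backs up and prunes such sets; finite transience (or periodicity) is what keeps the number of alpha-vectors from growing without bound over an infinite horizon.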

