New discount and average optimality conditions for continuous-time Markov decision processes

2010 ◽  
Vol 42 (4) ◽  
pp. 953-985 ◽  
Author(s):  
Xianping Guo ◽  
Liuer Ye

This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates, which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy via a value iteration technique. Moreover, by developing a new minimum nonnegative solution method for the average cost criterion, we prove the existence of an average cost optimal stationary policy under reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Apart from the requirement that the costs be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.
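
For context, the discounted criterion in a continuous-time MDP is typically formulated as follows (a standard textbook form with transition rates q(dy | x, a); a generic statement, not quoted from the paper):

$$
V_\alpha(x, \pi) = \mathbb{E}_x^{\pi}\!\left[ \int_0^{\infty} e^{-\alpha t}\, c(x_t, a_t)\, dt \right], \qquad \alpha > 0,
$$

and a discounted cost optimal stationary policy is one attaining the infimum in the associated optimality equation

$$
\alpha V^{*}(x) = \inf_{a \in A(x)} \left\{ c(x, a) + \int_X V^{*}(y)\, q(dy \mid x, a) \right\}.
$$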



1989 ◽  
Vol 3 (2) ◽  
pp. 247-272 ◽  
Author(s):  
Linn I. Sennott

Semi-Markov decision processes underlie the control of many queueing systems. In this paper, we deal with infinite state semi-Markov decision processes with nonnegative, unbounded costs and finite action sets. Axioms for the existence of an expected average cost optimal stationary policy are presented. These conditions generalize the work in Sennott [22] for Markov decision processes. Verifiable conditions for the axioms to hold are obtained. The theory is applied to control of the M/G/1 queue with variable service parameter, with on-off server, and with batch processing, and to control of the G/M/m queue with variable arrival parameter and customer rejection. It is applied to a timesharing network of queues with a single server and finally to optimal routing of Poisson arrivals to parallel exponential servers. The final section extends the existence result to compact action spaces.
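
As a concrete illustration of the style of queueing control treated here, the sketch below runs relative value iteration on a uniformized, buffer-truncated M/M/1 admission-control model. Everything in it (the names LAM, MU, BUF, HOLD, REJ and their values) is an illustrative assumption, not taken from the paper, and the finite truncation deliberately sidesteps the infinite-state issues the paper actually resolves.

```python
import numpy as np

LAM, MU, BUF = 0.6, 1.0, 50      # arrival rate, service rate, buffer cap (assumed)
HOLD, REJ = 1.0, 5.0             # holding cost per customer, rejection penalty (assumed)
U = LAM + MU                     # uniformization constant
p_arr, p_dep = LAM / U, MU / U   # one-step arrival / departure probabilities

h = np.zeros(BUF + 1)            # relative value function over queue lengths
g = 0.0                          # running estimate of the optimal average cost
for _ in range(10_000):
    Th = np.empty_like(h)
    for x in range(BUF + 1):
        best = np.inf
        for a in (0, 1):                       # a = 0: reject, a = 1: admit
            admit = (a == 1) and (x < BUF)     # a full buffer forces rejection
            nxt = (p_arr * h[x + 1] if admit else p_arr * h[x]) \
                  + p_dep * h[max(x - 1, 0)]   # departure (self-loop at x = 0)
            cost = HOLD * x + (0.0 if admit else REJ * p_arr)
            best = min(best, cost + nxt)
        Th[x] = best
    g = Th[0]                    # average-cost estimate at reference state 0
    h_new = Th - g               # renormalize so h_new[0] == 0
    if np.max(np.abs(h_new - h)) < 1e-9:
        break
    h = h_new
print(f"estimated optimal average cost: {g:.4f}")
```

Subtracting the value at the reference state on each sweep keeps h bounded; for this unichain model the scalar g converges to the optimal expected average cost.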


2006 ◽  
Vol 43 (2) ◽  
pp. 318-334 ◽  
Author(s):  
Xianping Guo ◽  
Quanxin Zhu

In this paper we study discrete-time Markov decision processes with Borel state and action spaces. The criterion is to minimize average expected costs, and the costs may have neither upper nor lower bounds. We first provide two average optimality inequalities of opposing directions and give conditions for the existence of solutions to them. Then, using the two inequalities, we ensure the existence of an average optimal (deterministic) stationary policy under additional continuity-compactness assumptions. Our conditions are slightly weaker than those in the previous literature. Also, some new sufficient conditions for the existence of an average optimal stationary policy are imposed on the primitive data of the model. Moreover, our approach is slightly different from the well-known ‘optimality inequality approach’ widely used in Markov decision processes. Finally, we illustrate our results in two examples.
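
The two opposing inequalities are roughly of the following standard form (reconstructed from the general average-optimality literature, not quoted from the paper), for a constant ρ, a measurable function h, and transition kernel Q:

$$
\rho + h(x) \;\ge\; \inf_{a \in A(x)} \left\{ c(x, a) + \int_X h(y)\, Q(dy \mid x, a) \right\},
$$

$$
\rho + h(x) \;\le\; \inf_{a \in A(x)} \left\{ c(x, a) + \int_X h(y)\, Q(dy \mid x, a) \right\}.
$$

Typically, a stationary policy attaining the infimum in the first inequality is average optimal, with ρ the optimal average cost.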


2006 ◽  
Vol 2006 ◽  
pp. 1-8 ◽  
Author(s):  
Quanxin Zhu ◽  
Xianping Guo

This paper deals with discrete-time Markov decision processes with Borel state and action spaces. The criterion to be minimized is the average expected cost, and the costs may have neither upper nor lower bounds. In our earlier paper (to appear in Journal of Applied Probability), we proposed weaker conditions ensuring the existence of average optimal stationary policies. In this paper, we further study properties of optimal policies. Under these weaker conditions, we not only obtain two necessary and sufficient conditions for optimal policies, but also give a “semimartingale characterization” of an average optimal stationary policy.
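
A hedged sketch of the flavor of such a characterization, in standard notation rather than the paper's exact statement: with optimal average cost ρ* and a suitable relative value function h, the process

$$
M_n := h(x_n) + \sum_{k=0}^{n-1} \bigl( c(x_k, a_k) - \rho^{*} \bigr), \qquad n \ge 0,
$$

is a submartingale under every policy, and a stationary policy is average optimal precisely when (M_n) is a martingale under it.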


2021 ◽  
Vol 229 ◽  
pp. 01047 ◽
Author(s):  
Abdellatif Semmouri ◽  
Mostafa Jourhmane ◽  
Bahaa Eddine Elbaghazaoui

In this paper we consider constrained optimization of discrete-time Markov decision processes (MDPs) with finite state and action spaces, which accumulate both a reward and costs at each decision epoch. We study the problem of finding a policy that maximizes the expected total discounted reward subject to the constraint that the expected total discounted costs do not exceed given values. To compute an optimal or a nearly optimal stationary policy, we investigate the decomposition of the state space into strongly communicating classes. The discounted criterion has many applications in areas such as forest management, energy consumption management, finance, communication systems (mobile networks), and artificial intelligence.
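
For a sense of how such constrained discounted problems are commonly solved, the sketch below uses the classical occupation-measure linear program (a standard formulation chosen for illustration, not the strongly-communicating-classes decomposition studied in the paper). All model data (S, A, gamma, kernel P, reward r, cost c, budget B) are randomly generated placeholders.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
S, A, gamma, B = 4, 2, 0.9, 6.0        # placeholder model sizes and budget
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)      # P[a, s, s'] = transition probability
r = rng.random((S, A))                 # reward per (state, action)
c = rng.random((S, A))                 # cost per (state, action)
mu0 = np.full(S, 1.0 / S)              # initial state distribution

# Occupation-measure variables x[s, a] >= 0, flattened in (s, a) order.
# Flow constraints: for every state s,
#   sum_a x[s, a] - gamma * sum_{s', a'} P[a', s', s] * x[s', a'] = mu0[s].
A_eq = np.zeros((S, S * A))
for s in range(S):
    for sp in range(S):
        for a in range(A):
            A_eq[s, sp * A + a] = float(sp == s) - gamma * P[a, sp, s]

# Maximize discounted reward (linprog minimizes, so negate) subject to the
# single budget constraint: expected total discounted cost <= B.
res = linprog(
    -r.reshape(-1),
    A_ub=c.reshape(1, -1), b_ub=[B],
    A_eq=A_eq, b_eq=mu0,
    bounds=[(0, None)] * (S * A), method="highs",
)
assert res.status == 0, res.message
x = res.x.reshape(S, A)
policy = x / x.sum(axis=1, keepdims=True)  # randomized stationary policy
print("optimal constrained discounted reward:", -res.fun)
```

Constrained MDPs generally require randomized stationary policies, and the LP yields one directly by normalizing the optimal occupation measure row by row.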

