Average optimal policies in Markov decision drift processes with applications to a queueing and a replacement model

Arie Hordijk; Frank A. Van Der Duyn Schouten

doi:10.2307/1426437

Advances in Applied Probability ◽

10.1017/s0001867800021182 ◽

1983 ◽

Vol 15 (02) ◽

pp. 274-303 ◽

Cited By ~ 3

Author(s):

Arie Hordijk ◽

Frank A. Van Der Duyn Schouten

Keyword(s):

Markov Decision Processes ◽

Optimal Policy ◽

Continuous Time ◽

Sufficient Conditions ◽

Decision Processes ◽

Time Parameter ◽

Queueing Model ◽

Replacement Model ◽

Optimal Policies ◽

Markov Decision

Recently the authors introduced the concept of Markov decision drift processes. A Markov decision drift process can be seen as a straightforward generalization of a Markov decision process with continuous time parameter. In this paper we investigate the existence of stationary average optimal policies for Markov decision drift processes. Using a well-known Abelian theorem we derive sufficient conditions which guarantee that a ‘limit point’ of a sequence of discounted optimal policies with the discounting factor approaching 1 is an average optimal policy. An alternative set of sufficient conditions is obtained for the case in which the discounted optimal policies generate regenerative stochastic processes. The latter set of conditions is easier to verify in several applications. The results of this paper are also applicable to Markov decision processes with discrete or continuous time parameter and to semi-Markov decision processes. In this sense they generalize some well-known results for Markov decision processes with finite or compact action space. Applications to an M/M/1 queueing model and a maintenance replacement model are given. It is shown that under certain conditions on the model parameters the average optimal policy for the M/M/1 queueing model is monotone non-decreasing (as a function of the number of waiting customers) with respect to the service intensity and monotone non-increasing with respect to the arrival intensity. For the maintenance replacement model we prove the average optimality of a bang-bang type policy. Special attention is paid to the computation of the optimal control parameters.
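
For orientation, the Abelian theorem mentioned in this abstract is commonly stated as follows in its discrete-time form. This is the standard textbook statement for a sequence of nonnegative costs $c_0, c_1, \dots$ and a discount factor $\beta \in (0,1)$; the symbols are chosen here for illustration, and the paper may use a continuous-time variant.

```latex
\liminf_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} c_n
\;\le\; \liminf_{\beta\uparrow 1}\,(1-\beta)\sum_{n=0}^{\infty}\beta^{n} c_n
\;\le\; \limsup_{\beta\uparrow 1}\,(1-\beta)\sum_{n=0}^{\infty}\beta^{n} c_n
\;\le\; \limsup_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} c_n .
```

Reading $c_n$ as the expected cost in period $n$ under a fixed policy, the last inequality bounds the normalized discounted cost by the long-run average cost as $\beta \uparrow 1$, which is the link between the discounted and the average-cost problems.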

Impulsive Control for Continuous-Time Markov Decision Processes

Advances in Applied Probability ◽

10.1239/aap/1427814583 ◽

2015 ◽

Vol 47 (1) ◽

pp. 106-127 ◽

Cited By ~ 6

Author(s):

François Dufour ◽

Alexei B. Piunovskiy

Keyword(s):

Optimal Control ◽

Control Problem ◽

Markov Decision Processes ◽

Control Strategy ◽

Continuous Time ◽

Sufficient Conditions ◽

Decision Processes ◽

Optimal Control Strategy ◽

Optimality Equation ◽

Markov Decision

In this paper our objective is to study continuous-time Markov decision processes on a general Borel state space with both impulsive and continuous controls for the infinite time horizon discounted cost. The continuous-time controlled process is shown to be nonexplosive under appropriate hypotheses. The so-called Bellman equation associated with this control problem is studied. Sufficient conditions ensuring the existence and the uniqueness of a bounded measurable solution to this optimality equation are provided. Moreover, it is shown that the value function of the optimization problem under consideration satisfies this optimality equation. Sufficient conditions are also presented to ensure on the one hand the existence of an optimal control strategy, and on the other hand the existence of an ε-optimal control strategy. The decomposition of the state space into two disjoint subsets is exhibited where, roughly speaking, one should apply a gradual action or an impulsive action, respectively, to obtain an optimal or ε-optimal strategy. An interesting consequence of our previous results is as follows: the set of strategies that allow interventions at time t = 0 and only immediately after natural jumps is a sufficient set for the control problem under consideration.

Markov decision processes with continuous time parameter

European Journal of Operational Research ◽

10.1016/0377-2217(84)90298-4 ◽

1984 ◽

Vol 16 (3) ◽

pp. 392-393

Author(s):

M. Schäl

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Decision Processes ◽

Time Parameter ◽

Markov Decision

Impulsive Control for Continuous-Time Markov Decision Processes

Advances in Applied Probability ◽

10.1017/s0001867800007722 ◽

2015 ◽

Vol 47 (01) ◽

pp. 106-127 ◽

Cited By ~ 2

Author(s):

François Dufour ◽

Alexei B. Piunovskiy

Keyword(s):

Optimal Control ◽

Control Problem ◽

Markov Decision Processes ◽

Control Strategy ◽

Continuous Time ◽

Sufficient Conditions ◽

Decision Processes ◽

Optimal Control Strategy ◽

Optimality Equation ◽

Markov Decision

In this paper our objective is to study continuous-time Markov decision processes on a general Borel state space with both impulsive and continuous controls for the infinite time horizon discounted cost. The continuous-time controlled process is shown to be nonexplosive under appropriate hypotheses. The so-called Bellman equation associated with this control problem is studied. Sufficient conditions ensuring the existence and the uniqueness of a bounded measurable solution to this optimality equation are provided. Moreover, it is shown that the value function of the optimization problem under consideration satisfies this optimality equation. Sufficient conditions are also presented to ensure on the one hand the existence of an optimal control strategy, and on the other hand the existence of an ε-optimal control strategy. The decomposition of the state space into two disjoint subsets is exhibited where, roughly speaking, one should apply a gradual action or an impulsive action, respectively, to obtain an optimal or ε-optimal strategy. An interesting consequence of our previous results is as follows: the set of strategies that allow interventions at time t = 0 and only immediately after natural jumps is a sufficient set for the control problem under consideration.

Absorbing Continuous-Time Markov Decision Processes with Total Cost Criteria

Advances in Applied Probability ◽

10.1239/aap/1370870127 ◽

2013 ◽

Vol 45 (2) ◽

pp. 490-519 ◽

Cited By ~ 4

Author(s):

Xianping Guo ◽

Mantas Vykertas ◽

Yi Zhang

Keyword(s):

Markov Decision Processes ◽

Optimal Policy ◽

Continuous Time ◽

Strong Duality ◽

Performance Measure ◽

Decision Processes ◽

Linear Programs ◽

Constrained Problems ◽

Markov Decision ◽

Unconstrained Problem

In this paper we study absorbing continuous-time Markov decision processes in Polish state spaces with unbounded transition and cost rates, and history-dependent policies. The performance measure is the expected total undiscounted cost. For the unconstrained problem, we show the existence of a deterministic stationary optimal policy, whereas for the constrained problems with N constraints, we show the existence of a mixed stationary optimal policy, where the mixture is over no more than N+1 deterministic stationary policies. Furthermore, a strong duality result is obtained for the associated linear programs.

Markov Decision Processes with Continuous Time Parameter

Journal of the Operational Research Society ◽

10.2307/2581180 ◽

1984 ◽

Vol 35 (4) ◽

pp. 366

Author(s):

Sean Collins ◽

F. A. van der Duyn Schouten

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Decision Processes ◽

Time Parameter ◽

Markov Decision

Markov Decision Processes with Continuous Time Parameter

Journal of the Operational Research Society ◽

10.1057/jors.1984.74 ◽

1984 ◽

Vol 35 (4) ◽

pp. 366-367

Author(s):

Sean Collins

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Decision Processes ◽

Time Parameter ◽

Markov Decision

Adaptive control of M/M/1 queues—continuous-time Markov decision process approach

Journal of Applied Probability ◽

10.1017/s0021900200023512 ◽

1983 ◽

Vol 20 (02) ◽

pp. 368-379

Author(s):

Lam Yeh ◽

L. C. Thomas

Keyword(s):

Adaptive Control ◽

Markov Decision Process ◽

Markov Decision Processes ◽

Optimal Policy ◽

Continuous Time ◽

Decision Process ◽

Process Approach ◽

Decision Processes ◽

Markov Decision ◽

Discounted Costs

By considering continuous-time Markov decision processes where decisions can be made at any time, we show in the case of M/M/1 queues with discounted costs that there exists a monotone optimal policy among all the regular policies.
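
As a rough numerical illustration of the kind of statement made in this abstract, the sketch below runs discounted-cost value iteration on an M/M/1 queue with a controllable service rate. Everything specific in it is an assumption made for the example and not taken from the paper: the function name `solve_mm1`, the choice of service-rate control, the linear holding cost, the parameter values, the truncation of the queue at `n_max`, and the use of uniformization to obtain a discrete-time problem.

```python
import numpy as np


def solve_mm1(lam=1.0, rates=(0.0, 1.5, 3.0), service_cost=(0.0, 2.0, 6.0),
              hold=1.0, alpha=0.1, n_max=60, tol=1e-9, max_iter=100_000):
    """Value iteration for a truncated, uniformized M/M/1 queue (illustrative).

    lam: arrival rate; rates: selectable service rates; service_cost: cost
    rate of each service rate; hold: holding cost rate per customer;
    alpha: continuous-time discount rate. All values are hypothetical.
    """
    rates = np.asarray(rates, dtype=float)
    service_cost = np.asarray(service_cost, dtype=float)
    big = lam + rates.max()            # uniformization rate
    beta = big / (alpha + big)         # equivalent discrete discount factor
    states = np.arange(n_max + 1)
    up = np.minimum(states + 1, n_max)  # arrivals are blocked at n_max
    down = np.maximum(states - 1, 0)    # no departure from an empty queue

    # One-stage cost of action a in state x, scaled by the uniformization.
    stage = (hold * states[:, None] + service_cost[None, :]) / (alpha + big)
    v = np.zeros(n_max + 1)
    for _ in range(max_iter):
        # Expected next-step value: arrival, service completion, or dummy jump.
        cont = (lam * v[up][:, None]
                + rates[None, :] * v[down][:, None]
                + (big - lam - rates)[None, :] * v[:, None]) / big
        q = stage + beta * cont
        v_new = q.min(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    policy = q.argmin(axis=1)  # index into `rates` chosen in each state
    return v, policy
```

Calling `v, policy = solve_mm1()` returns the approximate value function and, for each queue length, the index of the selected service rate. Inspecting `np.diff(policy)` shows whether the computed policy is non-decreasing in the queue length for the chosen parameters; states near `n_max` can be distorted by the truncation.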

New sufficient conditions for average optimality in continuous-time Markov decision processes

Mathematical Methods of Operations Research ◽

10.1007/s00186-010-0307-4 ◽

2010 ◽

Vol 72 (1) ◽

pp. 75-94 ◽

Cited By ~ 4

Author(s):

Liuer Ye ◽

Xianping Guo

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Sufficient Conditions ◽

Decision Processes ◽

Markov Decision

Adaptive control of M/M/1 queues—continuous-time Markov decision process approach

Journal of Applied Probability ◽

10.2307/3213809 ◽

1983 ◽

Vol 20 (2) ◽

pp. 368-379 ◽

Cited By ~ 6

Author(s):

Lam Yeh ◽

L. C. Thomas

Keyword(s):

Adaptive Control ◽

Markov Decision Process ◽

Markov Decision Processes ◽

Optimal Policy ◽

Continuous Time ◽

Decision Process ◽

Process Approach ◽

Decision Processes ◽

Markov Decision ◽

Discounted Costs

By considering continuous-time Markov decision processes where decisions can be made at any time, we show in the case of M/M/1 queues with discounted costs that there exists a monotone optimal policy among all the regular policies.