Average Reward Criterion

Author(s):  
O. Hernández-Lerma

2015 ◽  
Vol 52 (2) ◽  
pp. 419-440 ◽  
Author(s):  
Rolando Cavazos-Cadena ◽  
Raúl Montes-De-Oca ◽  
Karel Sladký

This paper concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that if the expected average reward associated with ℓ² is finite under every policy, then a stationary policy obtained in the standard way from the optimality equation is sample-path average optimal in a strong sense.
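The "standard way" referred to here is to select, in each state, an action attaining the maximum in the average reward optimality equation g + h(x) = max_a [ r(x, a) + Σ_y p(y | x, a) h(y) ]. As a rough illustration only, the sketch below solves this equation by relative value iteration on a finite truncation of the state space; the paper itself works on a denumerable space under a Lyapunov condition, and the arrays `P`, `r` and the function name are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def relative_value_iteration(P, r, tol=1e-9, max_iter=100_000):
    """Solve the average-reward optimality equation
        g + h(x) = max_a [ r(x,a) + sum_y P(y|x,a) h(y) ]
    by relative value iteration on a finite model.

    Assumptions (for this sketch): unichain, aperiodic dynamics,
    P of shape (S, A, S), r of shape (S, A).
    Returns (g, h, policy): the optimal gain, a relative value
    function, and a stationary policy attaining the maximum.
    """
    S, A, _ = P.shape
    h = np.zeros(S)
    g = 0.0
    for _ in range(max_iter):
        Q = r + P @ h               # Q[x, a] = r(x,a) + E[h(next state)]
        Th = Q.max(axis=1)          # one step of the optimality operator
        g = Th[0]                   # normalize at a reference state
        h_new = Th - g
        if np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    policy = Q.argmax(axis=1)       # maximizing action in each state
    return g, h, policy
```

On such a truncated model the returned policy picks a maximizing action in every state; the paper's contribution is that, under the Lyapunov condition, a policy obtained this way is average optimal along almost every sample path, not merely in expectation.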


1999 ◽  
Vol 30 (7-8) ◽  
pp. 7-20
Author(s):  
M. Kurano ◽  
M. Yasuda ◽  
J.-I. Nakagami ◽  
Y. Yoshida

1996 ◽  
Vol 28 (4) ◽  
pp. 1123-1144
Author(s):  
K. D. Glazebrook

A single machine is available to process a collection of jobs J, each of which evolves stochastically under processing. Jobs incur costs while awaiting the machine at a state-dependent rate, and processing must respect a set of precedence constraints Γ. Index policies are optimal in a variety of scenarios. The indices concerned are characterised as values of restart problems with the average reward criterion, a characterisation that yields a range of efficient approaches to their computation. Index-based suboptimality bounds are derived for general processing policies; these bounds enable sensitivity analyses and the evaluation of scheduling heuristics.
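To make the restart characterisation concrete: in one common formulation (in the spirit of restart-in-state characterisations of such indices), the index of a state x0 is the optimal long-run average reward of a two-action problem in which, at every state, one may either continue processing or restart from x0. The sketch below computes that value by relative value iteration on a finite model; the arrays `P`, `r`, the restart reward convention, and the function name are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def restart_index(P, r, x0, tol=1e-9, max_iter=100_000):
    """Index of state x0 as the optimal average reward of the
    associated restart problem: in every state, either continue
    (reward r, transitions P) or restart from x0.

    Assumptions (for this sketch): finite unichain model,
    P of shape (S, S) under processing, r of shape (S,);
    the restart action earns r[x0] and moves as from x0.
    """
    S = P.shape[0]
    h = np.zeros(S)
    g = 0.0
    for _ in range(max_iter):
        cont = r + P @ h            # value of continuing in each state
        rest = r[x0] + P[x0] @ h    # value of restarting from x0 (scalar)
        Th = np.maximum(cont, rest) # best of the two actions
        g = Th[x0]                  # normalize at the restart state
        h_new = Th - g
        if np.max(np.abs(h_new - h)) < tol:
            break
        h = h_new
    return g                        # the index of x0
```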


1982 ◽  
Vol 19 (2) ◽  
pp. 301-309 ◽  
Author(s):  
Zvi Rosberg

A semi-Markov decision process with a denumerable multidimensional state space is considered. At any given state only a finite number of actions can be taken to control the process. The immediate reward earned in one transition period is merely assumed to be bounded by a polynomial, and a bound is imposed on a weighted moment of the next state reached in one transition. It is shown that under an ergodicity assumption there is a stationary optimal policy for the long-run average reward criterion. A queueing network scheduling problem, for which previous criteria are inapplicable, is given as an application.
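For a fixed stationary policy in a semi-Markov model, the long-run average reward is the ratio of the expected reward per transition to the expected holding time per transition, taken under the stationary distribution of the embedded chain. A minimal sketch of this renewal-reward computation on a finite, unichain truncation follows; `P`, `r`, `tau` and the function name are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def smdp_average_reward(P, r, tau):
    """Long-run average reward of a stationary policy in an SMDP,
    via the renewal-reward ratio g = (pi @ r) / (pi @ tau).

    Assumptions (for this sketch): finite unichain truncation;
    P: (S, S) embedded transition matrix under the policy,
    r: (S,) expected one-transition rewards,
    tau: (S,) expected holding times (all positive).
    """
    S = P.shape[0]
    # Stationary distribution pi: solve pi P = pi with sum(pi) = 1,
    # posed as a least-squares system.
    A = np.vstack([P.T - np.eye(S), np.ones(S)])
    b = np.zeros(S + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return (pi @ r) / (pi @ tau)
```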

