Reversible Markov Decision Processes with an Average-Reward Criterion

2013 · Vol. 51 (1) · pp. 402-418 · Author(s): Randy Cogill, Cheng Peng

1982 · Vol. 19 (2) · pp. 301-309 · Author(s): Zvi Rosberg

A semi-Markov decision process with a denumerable, multidimensional state space is considered. At any given state, only a finite number of actions can be taken to control the process. The immediate reward earned in one transition period is assumed only to be bounded by a polynomial, and a bound is imposed on a weighted moment of the next state reached in one transition. It is shown that, under an ergodicity assumption, there is a stationary optimal policy for the long-run average-reward criterion. A queueing network scheduling problem, for which previous criteria are inapplicable, is given as an application.
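
For reference, the long-run average-reward criterion referred to above is conventionally defined as follows; this is the standard formulation, and the notation is ours rather than the paper's. For a policy \pi and initial state x,

    g^{\pi}(x) = \liminf_{T \to \infty} \frac{1}{T} \, \mathbb{E}^{\pi}_{x}\!\left[ \sum_{t=0}^{T-1} r(X_t, A_t) \right],

and a stationary policy \pi^{*} is optimal if g^{\pi^{*}}(x) \ge g^{\pi}(x) for every initial state x and every policy \pi. In the semi-Markov setting, the accumulated reward is divided by the elapsed (random) time of the first T transitions rather than by T itself.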


2000 · Vol. 14 (4) · pp. 533-548 · Author(s): Kazuyoshi Wakuta

We study the multichain case of a vector-valued Markov decision process with the average-reward criterion. We characterize optimal deterministic stationary policies via systems of linear inequalities and discuss a policy iteration algorithm for finding all such policies.
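
For concreteness, here is a minimal sketch of policy iteration under the average-reward criterion in the standard scalar-reward, unichain case. This is a simplification: the multichain, vector-valued setting studied above additionally requires comparing gain vectors through systems of linear inequalities, which this sketch does not attempt. The function name and the example MDP are illustrative, not taken from the paper.

import numpy as np

def average_reward_policy_iteration(P, r, tol=1e-10, max_iter=1000):
    # Policy iteration for a finite, unichain MDP under the average-reward criterion.
    # P: shape (A, S, S), P[a, s, s2] = probability of moving from s to s2 under action a.
    # r: shape (S, A), r[s, a] = expected immediate reward.
    # Returns (policy, gain g, bias h).
    S, A = r.shape
    policy = np.zeros(S, dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: solve  g + h[s] = r[s, pi(s)] + sum_s2 P[pi(s), s, s2] * h[s2]
        # with the normalization h[0] = 0 (unique solution in the unichain case).
        P_pi = P[policy, np.arange(S), :]            # (S, S) transition matrix under the policy
        r_pi = r[np.arange(S), policy]               # (S,) reward under the policy
        M = np.zeros((S, S))
        M[:, 0] = 1.0                                # coefficient of the unknown gain g
        M[:, 1:] = np.eye(S)[:, 1:] - P_pi[:, 1:]    # coefficients of the unknowns h[1:]
        x = np.linalg.solve(M, r_pi)
        g, h = x[0], np.concatenate(([0.0], x[1:]))
        # Policy improvement: act greedily with respect to the bias h, keeping the current
        # action unless another action is strictly better (avoids cycling on ties).
        q = r + np.einsum('asx,x->sa', P, h)         # q[s, a] = r[s, a] + sum_x P[a, s, x] * h[x]
        better = q.max(axis=1) > q[np.arange(S), policy] + tol
        new_policy = np.where(better, q.argmax(axis=1), policy)
        if np.array_equal(new_policy, policy):
            return policy, g, h
        policy = new_policy
    return policy, g, h

# Illustrative 2-state, 2-action MDP (made-up numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # transitions under action 1
r = np.array([[1.0, 0.5],
              [0.0, 2.0]])
pi, gain, bias = average_reward_policy_iteration(P, r)

In the multichain, vector-valued problem the gain may depend on the initial state and is a vector rather than a scalar, which is why the abstract characterizes optimal policies through systems of linear inequalities instead of a single greedy improvement step of this kind.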

