Denumerable controlled Markov chains with average reward criterion: sample path optimality

Author(s):  
R. Cavazos-Cadena ◽  
E. Fernandez-Gaucheraud
2015 ◽  
Vol 52 (2) ◽  
pp. 419-440
Author(s):  
Rolando Cavazos-Cadena ◽  
Raúl Montes-De-Oca ◽  
Karel Sladký

This paper concerns discrete-time Markov decision chains with a denumerable state space and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context, the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that if the expected average reward associated with ℓ² is finite under any policy, then a stationary policy obtained from the optimality equation in the standard way is sample-path average optimal in a strong sense.
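
For orientation, the objects named in the abstract can be written out in the usual textbook form. This is only a sketch under assumed notation, none of which appears in the abstract itself: S is the state space, A(x) the compact action set at state x, r the reward function, p_{xy}(a) the transition law, g the optimal average reward, h a relative value function, and f the stationary policy. The precise sense of "strong" sample-path optimality is defined in the paper and may differ in detail from the last two lines.

```latex
% Average reward optimality equation (standard form; notation assumed, not from the abstract)
g + h(x) = \sup_{a \in A(x)} \Big[ r(x,a) + \sum_{y \in S} p_{xy}(a)\, h(y) \Big],
  \qquad x \in S.

% Stationary policy obtained "in the standard way": f(x) attains the supremum
f(x) \in \operatorname*{arg\,max}_{a \in A(x)}
  \Big[ r(x,a) + \sum_{y \in S} p_{xy}(a)\, h(y) \Big],
  \qquad x \in S.

% A common formulation of sample-path average optimality (assumed here):
% under f the running average converges to g almost surely, and
% no policy \pi can exceed g along almost every sample path.
\lim_{n \to \infty} \frac{1}{n} \sum_{t=0}^{n-1} r(X_t, A_t) = g
  \quad P_x^{f}\text{-a.s.},
\qquad
\limsup_{n \to \infty} \frac{1}{n} \sum_{t=0}^{n-1} r(X_t, A_t) \le g
  \quad P_x^{\pi}\text{-a.s. for every policy } \pi.
```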


1999 ◽  
Vol 30 (7-8) ◽  
pp. 7-20
Author(s):  
M. Kurano ◽  
M. Yasuda ◽  
J.-I. Nakagami ◽  
Y. Yoshida
