compact action sets
Recently Published Documents


TOTAL DOCUMENTS

7
(FIVE YEARS 0)

H-INDEX

1
(FIVE YEARS 0)

2016 ◽  
Vol 44 (5) ◽  
pp. 575-577 ◽  
Author(s):  
Xiaoxi Li ◽  
Sylvain Sorin


2015 ◽  
Vol 52 (2) ◽  
pp. 419-440
Author(s):  
Rolando Cavazos-Cadena ◽  
Raúl Montes-De-Oca ◽  
Karel Sladký

This paper concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that if the expected average reward associated to ℓ2 is finite under any policy then a stationary policy obtained from the optimality equation in the standard way is sample-path average optimal in a strong sense.



2015 ◽  
Vol 52 (02) ◽  
pp. 419-440 ◽  
Author(s):  
Rolando Cavazos-Cadena ◽  
Raúl Montes-De-Oca ◽  
Karel Sladký

This paper concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that if the expected average reward associated to ℓ2is finite under any policy then a stationary policy obtained from the optimality equation in the standard way is sample-path average optimal in a strong sense.





2005 ◽  
Vol 42 (4) ◽  
pp. 905-918 ◽  
Author(s):  
Rolando Cavazos-Cadena ◽  
Raúl Montes-De-Oca

This work concerns Markov decision chains with finite state spaces and compact action sets. The performance index is the long-run risk-sensitive average cost criterion, and it is assumed that, under each stationary policy, the state space is a communicating class and that the cost function and the transition law depend continuously on the action. These latter data are not directly available to the decision-maker, but convergent approximations are known or are more easily computed. In this context, the nonstationary value iteration algorithm is used to approximate the solution of the optimality equation, and to obtain a nearly optimal stationary policy.



2005 ◽  
Vol 42 (04) ◽  
pp. 905-918
Author(s):  
Rolando Cavazos-Cadena ◽  
Raúl Montes-De-Oca

This work concerns Markov decision chains with finite state spaces and compact action sets. The performance index is the long-run risk-sensitive average cost criterion, and it is assumed that, under each stationary policy, the state space is a communicating class and that the cost function and the transition law depend continuously on the action. These latter data are not directly available to the decision-maker, but convergent approximations are known or are more easily computed. In this context, the nonstationary value iteration algorithm is used to approximate the solution of the optimality equation, and to obtain a nearly optimal stationary policy.



2000 ◽  
Vol 25 (4) ◽  
pp. 657-666 ◽  
Author(s):  
Rolando Cavazos-Cadena ◽  
Eugene A. Feinberg ◽  
Raúl Montes-de-Oca


Sign in / Sign up

Export Citation Format

Share Document