Centralized Optimization for Dec-POMDPs Under the Expected Average Reward Criterion

2017 ◽ Vol 62 (11) ◽ pp. 6032-6038
Author(s): Xiaofeng Jiang, Xiaodong Wang, Hongsheng Xi, Falin Liu

2015 ◽ Vol 52 (2) ◽ pp. 419-440
Author(s): Rolando Cavazos-Cadena, Raúl Montes-De-Oca, Karel Sladký

This paper concerns discrete-time Markov decision chains with a denumerable state space and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that if the expected average reward associated with ℓ² is finite under every policy, then a stationary policy obtained from the optimality equation in the standard way is sample-path average optimal in a strong sense.
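For reference, the optimality equation mentioned in the abstract has the following standard average-reward form (a generic sketch in common notation; the symbols g, h, r, p, S, and A(x) are not taken from the paper itself):

    g + h(x) = \max_{a \in A(x)} \Big[ r(x,a) + \sum_{y \in S} p(y \mid x,a)\, h(y) \Big], \qquad x \in S,

where g is the optimal expected average reward, h is a relative value (bias) function, r(x,a) is the one-step reward, and p(y | x,a) are the transition probabilities on the state space S. A stationary policy that selects, at each state x, an action attaining the maximum on the right-hand side is what is meant by a policy "obtained from the optimality equation in the standard way".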


1999 ◽ Vol 30 (7-8) ◽ pp. 7-20
Author(s): M. Kurano, M. Yasuda, J.-I. Nakagami, Y. Yoshida
