A policy iteration algorithm for Markov decision processes skip-free in one direction
The policy iteration algorithm for average reward Markov decision processes with general state space
1997 ◽ Vol 42 (12) ◽ pp. 1663-1680
2016 ◽ Vol 133 (10) ◽ pp. 28-33
2003 ◽ Vol 17 (2) ◽ pp. 213-234
2015 ◽ Vol 13 (3) ◽ pp. 47-57