Detecting optimal and non-optimal actions in average-cost Markov decision processes
We present two sufficient conditions for detecting optimal and non-optimal actions in (ergodic) average-cost Markov decision processes (MDPs). The conditions are easy to interpret and can be implemented as detection tests in both policy iteration and linear programming methods. We also discuss an efficient implementation of a recently proposed policy iteration scheme.
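For orientation, the following is a minimal sketch of textbook policy iteration for an ergodic average-cost MDP, the setting in which such detection tests would run. It is not the paper's method: the abstract does not state the two sufficient conditions, so none are implemented here. The names `solve_linear`, `evaluate`, `policy_iteration`, and the two-state example data `P` and `c` are all hypothetical and chosen only for illustration. A test that flags an action as non-optimal could, in principle, be used to drop that action from the improvement loop.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def evaluate(P, c, policy):
    """Return gain g and bias h (normalised so h[0] = 0) of a fixed policy.

    Solves g + h(s) = c(s, a) + sum_j P(j | s, a) h(j) with a = policy[s].
    Unknowns are ordered as [g, h(1), ..., h(n-1)].
    """
    n = len(policy)
    A, b = [], []
    for s in range(n):
        a = policy[s]
        row = [1.0] + [(1.0 if j == s else 0.0) - P[s][a][j] for j in range(1, n)]
        A.append(row)
        b.append(c[s][a])
    x = solve_linear(A, b)
    return x[0], [0.0] + x[1:]


def policy_iteration(P, c, tol=1e-9):
    """Standard average-cost policy iteration (cost minimisation)."""
    n = len(P)
    policy = [0] * n
    while True:
        g, h = evaluate(P, c, policy)
        changed = False
        for s in range(n):
            # One-step lookahead value of each action under the current bias.
            q = [c[s][a] + sum(P[s][a][j] * h[j] for j in range(n))
                 for a in range(len(P[s]))]
            best = min(range(len(q)), key=q.__getitem__)
            # Switch only on strict improvement, to avoid cycling on ties.
            if q[best] < q[policy[s]] - tol:
                policy[s] = best
                changed = True
        if not changed:
            return policy, g, h


# Hypothetical example: P[s][a][j] is a transition probability, c[s][a] a cost.
P = [[[0.5, 0.5], [0.9, 0.1]],
     [[0.5, 0.5], [0.2, 0.8]]]
c = [[2.0, 0.5],
     [1.0, 3.0]]

policy, g, h = policy_iteration(P, c)
print(policy, g)
```

On this example the loop stops after one improvement step at the policy `[1, 0]`, whose average cost is 7/12. Every transition probability is positive, so each stationary policy induces an ergodic chain and the evaluation system has a unique solution once `h[0]` is fixed at zero.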