Policy Learning for Time-Bounded Reachability in Continuous-Time Markov Decision Processes via Doubly-Stochastic Gradient Ascent

Author(s):  
Ezio Bartocci ◽  
Luca Bortolussi ◽  
Tomáš Brázdil ◽  
Dimitrios Milios ◽  
Guido Sanguinetti
2017 ◽  
Vol 116 ◽  
pp. 84-100

2002 ◽  
Vol 43 (4) ◽  
pp. 541-557 ◽  
Author(s):  
Xianping Guo ◽  
Weiping Zhu

Abstract: In this paper, we consider denumerable-state continuous-time Markov decision processes with (possibly unbounded) transition and cost rates under the average criterion. We present a set of conditions and prove that, under these conditions, there exist both average cost optimal stationary policies and a solution of the average optimality equation. The results are applied to an admission control queue model and to controlled birth and death processes.

