A Sublinear-Regret Reinforcement Learning Algorithm on Constrained Markov Decision Processes with reset action

Author(s): Takashi Watanabe, Takashi Sakuragawa
Author(s): Angelo Encapera, Abhijit Gosavi

Artificial intelligence techniques can play a significant role in solving problems encountered in the domain of Total Productive Maintenance (TPM). This paper considers a new reinforcement learning algorithm called iSMART, which can solve semi-Markov decision processes underlying control problems related to TPM. The algorithm uses a constant exploration rate, unlike its precursor R-SMART, which required exploration decay. Numerical experiments conducted here show encouraging behavior with the new algorithm.


IEEE Access, 2019, Vol 7, pp. 165007-165017
Author(s): Yangyang Ge, Fei Zhu, Xinghong Ling, Quan Liu
