Author(s):  
Angelo Encapera ◽  
Abhijit Gosavi

Artificial intelligence techniques can play a significant role in solving problems encountered in the domain of Total Productive Maintenance (TPM). This paper considers a new reinforcement learning algorithm called iSMART, which can solve semi-Markov decision processes underlying control problems related to TPM. The algorithm uses a constant exploration rate, unlike its precursor R-SMART, which required exploration decay. Numerical experiments conducted here show encouraging behavior with the new algorithm.


1987 ◽  
Vol 24 (01) ◽  
pp. 270-276
Author(s):  
Masami Kurano

This study is concerned with finite Markov decision processes whose dynamics and reward structure are unknown but whose state is exactly observable. We establish a learning algorithm that yields an optimal policy and construct an adaptive policy that is optimal under the average expected reward criterion.


2017 ◽  
Vol 47 (6) ◽  
pp. 1367-1379 ◽  
Author(s):  
Zhen Zhang ◽  
Dongbin Zhao ◽  
Junwei Gao ◽  
Dongqing Wang ◽  
Yujie Dai
