On-line EM Algorithm and Reinforcement Learning

Author(s):  
Shin Ishii ◽  
Masa-aki Sato
2001 ◽  
Vol 32 (5) ◽  
pp. 12-20 ◽  
Author(s):  
Junichiro Yoshimoto ◽  
Shin Ishii ◽  
Masa-aki Sato

Author(s):  
Atsushi Wada ◽  
Keiki Takadama
Learning Classifier Systems (LCSs) are rule-based adaptive systems that combine Reinforcement Learning (RL) with rule-discovery mechanisms for effective and practical on-line learning. With the aim of establishing a common theoretical basis between LCSs and RL algorithms so that findings can be shared across the two fields, a detailed analysis was performed to compare their learning processes. Building on our previous work deriving an equivalence between the Zeroth-level Classifier System (ZCS) and Q-learning with Function Approximation (FA), this paper extends the analysis to the effect of actually applying the conditions required for this equivalence. Comparative experiments reveal two implications: (1) ZCS's original parameter, the deduction rate, helps stabilize action selection, but (2) from the RL perspective, this process prevents accurate value estimation over the entire state-action space, which limits ZCS's performance on problems that require accurate value estimates.
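The ZCS/Q-learning correspondence can be made concrete with a minimal sketch of Q-learning under linear function approximation, in which each classifier is treated as a binary feature over states and its strength as the corresponding weight. The toy feature map and all parameter names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Sketch: Q-learning with linear function approximation where each
# "classifier" is a binary feature that matches a subset of states.
# Under the equivalence discussed above, a classifier's strength
# corresponds to a weight of this linear approximator.

n_states, n_actions, n_features = 10, 2, 6
alpha, gamma = 0.1, 0.9            # learning rate and discount factor

rng = np.random.default_rng(0)
# Fixed binary "match" matrix: feature_map[s, i] = 1 if classifier i matches state s.
feature_map = (rng.random((n_states, n_features)) < 0.4).astype(float)

# One weight vector per action; Q(s, a) = w[a] . phi(s)
w = np.zeros((n_actions, n_features))

def q(state, action):
    return w[action] @ feature_map[state]

def update(state, action, reward, next_state, done):
    """One Q-learning step: the TD error is distributed over the matching
    features, mirroring payoff distribution to the ZCS action set."""
    target = reward if done else reward + gamma * max(q(next_state, a) for a in range(n_actions))
    td_error = target - q(state, action)
    w[action] += alpha * td_error * feature_map[state]
    return td_error
```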


2004 ◽  
Vol 16 (3) ◽  
pp. 491-499 ◽  
Author(s):  
István Szita ◽  
András Lőrincz

There is growing interest in using Kalman filter models in brain modeling. The question arises whether Kalman filter models can be used on-line not only for estimation but also for control. The usual method of computing optimal control for a Kalman filter relies on off-line backward recursion, which is unsatisfactory for this purpose. Here, it is shown that a slight modification of the linear-quadratic-Gaussian (LQG) Kalman filter model overcomes this difficulty: it allows the optimal control to be estimated on-line by reinforcement learning. Moreover, the resulting learning rule for value estimation takes a Hebbian form weighted by the error of the value estimation.
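A minimal sketch may help illustrate the flavor of such a rule: for a quadratic value function V(x) = xᵀPx over an LQG-style system, an on-line temporal-difference update of P is an outer product of the state with itself (a Hebbian term) weighted by the value-estimation (TD) error. The dynamics, costs, and fixed feedback gain below are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

# Sketch: on-line TD learning of a quadratic value function V(x) = x^T P x
# for a linear system with quadratic costs.  The update to P is the TD error
# times np.outer(x, x), i.e. a Hebbian-form rule weighted by the error of the
# value estimation.

rng = np.random.default_rng(0)

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # state dynamics (assumed)
B = np.array([[0.0], [0.1]])             # control input matrix (assumed)
Q = np.eye(2)                            # state cost
R = np.array([[0.1]])                    # control cost
K = np.array([[0.5, 0.8]])               # fixed stabilizing feedback gain (assumed)

gamma, alpha = 0.95, 0.01
P = np.zeros((2, 2))                     # parameters of V(x) = x^T P x

x = rng.normal(size=2)
for t in range(5000):
    u = -K @ x                                                    # control action
    cost = x @ Q @ x + u @ R @ u                                  # instantaneous cost
    x_next = A @ x + (B @ u).ravel() + 0.01 * rng.normal(size=2)  # noisy dynamics
    td_error = cost + gamma * (x_next @ P @ x_next) - x @ P @ x   # value-estimation error
    P += alpha * td_error * np.outer(x, x)                        # Hebbian-form update
    x = x_next
```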

