Online state space generation by a growing self-organizing map and differential learning for reinforcement learning

2020, Vol 97, pp. 106723
Author(s): Akira Notsu, Koji Yasuda, Seiki Ubukata, Katsuhiro Honda

2018, Vol 27 (2), pp. 111-126
Author(s): Thommen George Karimpanal, Roland Bouffanais

The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In this work, we describe a novel approach for reusing previously acquired knowledge by using it to guide the exploration of an agent while it learns new tasks. In order to do so, we employ a variant of the growing self-organizing map algorithm, which is trained using a measure of similarity that is defined directly in the space of the vectorized representations of the value functions. In addition to enabling transfer across tasks, the resulting map is simultaneously used to enable the efficient storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment and also demonstrate its utility through simple experiments using a mobile micro-robotics platform. In addition, we demonstrate the scalability of this approach and analytically examine its relation to the proposed network growth mechanism. Furthermore, we briefly discuss some of the possible improvements and extensions to this approach, as well as its relevance to real-world scenarios in the context of continual learning.
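As a rough illustration of how such a map might operate, the sketch below assumes that each map node stores a vectorized value function (e.g. a flattened Q-table) and that similarity is plain Euclidean distance in that space; the class name, growth threshold, and learning rate are hypothetical and are not taken from the paper.

import numpy as np

class GrowingValueSOM:
    # Minimal sketch of a growing SOM whose nodes store vectorized value
    # functions; similarity is Euclidean distance in that vector space.
    # Names and thresholds are illustrative, not the authors' implementation.
    def __init__(self, dim, spawn_threshold=1.0, lr=0.1):
        self.nodes = [np.zeros(dim)]      # start with a single node
        self.spawn_threshold = spawn_threshold
        self.lr = lr

    def best_matching(self, q_vec):
        # index of and distance to the node closest to the incoming vector
        dists = [np.linalg.norm(q_vec - n) for n in self.nodes]
        i = int(np.argmin(dists))
        return i, dists[i]

    def update(self, q_vec):
        i, dist = self.best_matching(q_vec)
        if dist > self.spawn_threshold:
            # the value function resembles no stored task: grow a new node
            self.nodes.append(q_vec.copy())
            return len(self.nodes) - 1
        # otherwise pull the winning node toward the new value function
        self.nodes[i] = self.nodes[i] + self.lr * (q_vec - self.nodes[i])
        return i

During learning of a new target task, the value function held by the winning node could then bias exploration toward actions that were useful in the most similar source task; this captures the spirit, not the exact form, of the exploration guidance described in the abstract.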


2004, Vol 7 (4), pp. 193-197
Author(s): Takeshi Tateyama, Seiichi Kawata, Toshiki Oguchi

2014
Author(s): Daniel de Filgueiras Gomes, Aluizio Fausto Ribeiro Araujo

2010, Vol 2010, pp. 1-9
Author(s): Takashi Kuremoto, Takahito Komoto, Kunikazu Kobayashi, Masanao Obayashi

An improved self-organizing map (SOM), the parameterless growing SOM (PL-G-SOM), is proposed in this paper. To overcome problems of the traditional SOM (Kohonen, 1982), various structure-growing SOMs and parameter-adjusting SOMs have been proposed, though usually separately. Here, we combine the ideas of growing SOMs (Bauer and Villmann, 1997; Dittenbach et al., 2000) and the parameterless SOM (Berglund and Sitte, 2006) into a novel SOM, PL-G-SOM, that realizes additional learning, optimal neighborhood preservation, and automatic tuning of parameters. The improved SOM is applied to construct a voice-instruction learning system for partner robots that adopts a simple reinforcement learning algorithm. The user's voice instructions are first classified by the PL-G-SOM, and the robot then chooses an action according to a stochastic policy. The policy is adjusted by the reward or punishment given by the user. A feeling map is also designed to express how well each voice instruction has been learned. Learning and additional-learning experiments using instructions in multiple languages, including Japanese, English, Chinese, and Malaysian, confirmed the effectiveness of the proposed system.
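The sketch below illustrates the combination in a hedged way: the learning rate is taken as the current quantization error normalized by the largest error seen so far (in the spirit of Berglund and Sitte's parameterless SOM), a new unit is grown when that normalized error stays large, and a softmax policy's action preferences are adjusted by the user's reward or punishment. All names, thresholds, and update rules are assumptions for illustration, not the paper's exact algorithm.

import numpy as np

class PLGSOMSketch:
    # Parameterless, growing SOM sketch: learning rate = quantization error
    # normalized by the largest error seen so far; a unit is added when the
    # normalized error exceeds a threshold (assumed rule, not the paper's).
    def __init__(self, dim, grow_threshold=0.8, max_units=50):
        self.w = [np.random.rand(dim)]
        self.max_err = 1e-9
        self.grow_threshold = grow_threshold
        self.max_units = max_units

    def classify(self, x, learn=True):
        dists = [np.linalg.norm(x - wi) for wi in self.w]
        win = int(np.argmin(dists))
        err = dists[win]
        self.max_err = max(self.max_err, err)
        rate = err / self.max_err             # parameterless learning rate
        if learn:
            if rate > self.grow_threshold and len(self.w) < self.max_units:
                self.w.append(x.copy())       # additional learning: grow
                win = len(self.w) - 1
            else:
                self.w[win] = self.w[win] + rate * (x - self.w[win])
        return win

def choose_action(prefs, tau=1.0):
    # stochastic (softmax) policy over the robot's candidate actions
    p = np.exp(np.asarray(prefs, dtype=float) / tau)
    p = p / p.sum()
    return int(np.random.choice(len(prefs), p=p))

def reinforce(prefs, action, reward, step=0.5):
    # raise or lower the chosen action's preference by the user's
    # reward/punishment signal
    prefs[action] += step * reward
    return prefs

In this reading, a trained map unit plays the role of a recognized voice instruction, and the preference vector attached to that unit is what choose_action and reinforce would operate on.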


Author(s): Akira Notsu, Yuichi Hattori, Seiki Ubukata, Katsuhiro Honda, ...

In reinforcement learning, agents learn appropriate actions for each situation from the consequences of those actions while interacting with the environment. Reinforcement learning is compatible with self-organizing maps, which accomplish unsupervised learning by reacting to impulses and strengthening neurons. Numerous studies have therefore investigated reinforcement learning in which agents learn the state space using self-organizing maps. In this study, with the aim of applying these previous studies to transfer learning and to the visualization of the human learning process, we introduced self-organizing maps into reinforcement learning and attempted to make the agent's "state and action" learning process visible. We performed numerical experiments on the 2D goal-search problem, and our model visualized the learning process of the agent.
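A minimal sketch of this kind of setup is given below, under the assumption that a small SOM quantizes the agent's continuous 2-D position into discrete states and that tabular Q-learning runs on top of those states; the sizes, rates, and function names are illustrative rather than the authors' model.

import numpy as np

# Hypothetical sketch: SOM nodes quantize the 2-D arena into discrete
# states, tabular Q-learning updates run on those states, and the node
# positions themselves can be plotted to visualize what has been learned.
N_NODES, N_ACTIONS = 16, 4
som = np.random.rand(N_NODES, 2)        # node positions in the arena
Q = np.zeros((N_NODES, N_ACTIONS))
alpha, gamma, sigma = 0.1, 0.95, 0.05

def state_of(pos):
    # winning node index = the agent's discrete state
    return int(np.argmin(np.linalg.norm(som - pos, axis=1)))

def learn_step(pos, action, reward, next_pos):
    s, s2 = state_of(pos), state_of(next_pos)
    # standard Q-learning update on the SOM-quantized state
    Q[s, action] += alpha * (reward + gamma * Q[s2].max() - Q[s, action])
    # move the winning node toward the visited position (state learning)
    som[s] += sigma * (pos - som[s])
    return s2

Plotting the node positions over training, together with each node's greedy action from Q, would give one way to visualize the "state and action" learning process the abstract refers to.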

