Optimization of Deep Reinforcement Learning with Hybrid Multi-Task Learning

Author(s): Nelson Vithayathil Varghese, Qusay H. Mahmoud
10.29007/g7bg, 2019
Author(s): João Ribeiro, Francisco Melo, João Dias

In this paper we investigate two hypotheses regarding the use of deep reinforcement learning in multiple tasks. The first hypothesis is driven by the question of whether a deep reinforcement learning algorithm, trained on two similar tasks, is able to outperform two single-task, individually trained algorithms by more efficiently learning a new, similar task that none of the three algorithms has encountered before. The second hypothesis is driven by the question of whether the same multi-task deep RL algorithm, trained on two similar tasks and augmented with elastic weight consolidation (EWC), is able to retain performance on the new task similar to that of the same algorithm without EWC, whilst overcoming catastrophic forgetting in the two previous tasks. We show that a multi-task Asynchronous Advantage Actor-Critic (GA3C) algorithm, trained on Space Invaders and Demon Attack, is in fact able to outperform two single-task GA3C versions, each trained individually on one of those tasks, when evaluated on a new, third task, namely Phoenix. We also show that, when two trained multi-task GA3C algorithms are further trained on the third task, the one augmented with EWC not only achieves similar performance on the new task but also overcomes a substantial amount of catastrophic forgetting on the two previous tasks.
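
The abstract above names EWC but does not give its form. As a rough illustration only, the commonly cited EWC formulation adds a quadratic penalty, (lambda / 2) * sum_i F_i * (theta_i - theta_star_i)^2, to the loss of the new task, where theta_star holds the parameters learned on the earlier tasks and F_i is a per-parameter importance estimate (typically the diagonal of the Fisher information). The sketch below assumes that formulation; the function names `ewc_penalty` and `ewc_loss` and all argument names are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def ewc_penalty(theta: Sequence[float],
                theta_star: Sequence[float],
                fisher: Sequence[float],
                lam: float) -> float:
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    theta      -- current parameters while training on the new task
    theta_star -- parameters saved after training on the earlier tasks
    fisher     -- per-parameter importance (assumed: diagonal Fisher estimate)
    lam        -- strength of the penalty (hypothetical hyperparameter name)
    """
    if not (len(theta) == len(theta_star) == len(fisher)):
        raise ValueError("theta, theta_star and fisher must have equal length")
    total = 0.0
    for t, t_star, f in zip(theta, theta_star, fisher):
        # Parameters with large importance f are pulled strongly back
        # towards their old values; unimportant ones are free to move.
        total += f * (t - t_star) ** 2
    return 0.5 * lam * total


def ewc_loss(new_task_loss: float,
             theta: Sequence[float],
             theta_star: Sequence[float],
             fisher: Sequence[float],
             lam: float) -> float:
    """Loss on the new task plus the EWC penalty anchoring old-task weights."""
    return new_task_loss + ewc_penalty(theta, theta_star, fisher, lam)
```

Under this assumed formulation, setting `lam` to zero recovers plain training on the new task, while larger values trade new-task plasticity for retention of the earlier tasks.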


2021, Vol 11 (1)
Author(s): Jia Shen, Pan-Tong Yao, Shaoyu Ge, Qiaojie Xiong

Abstract: Auditory-cued goal-oriented behaviors require the participation of cortical and subcortical brain areas, but how neural circuits associate sensory-based decisions with goal locations through learning remains poorly understood. The hippocampus is critical for spatial coding, suggesting its possible involvement in transforming sensory inputs into goal-oriented decisions. Here, we developed an auditory discrimination task in which rats learned to navigate to goal locations based on the frequencies of auditory stimuli. Using in vivo calcium imaging in freely behaving rats over the course of learning, we found that dentate granule cells became more active, spatially tuned, and responsive to task-related variables as learning progressed. Furthermore, only after task learning did the activity of dentate granule cell ensembles represent the navigation path and predict auditory decisions as early as when rats began to approach the goals. Finally, chemogenetic silencing of the dentate gyrus suppressed task learning. Our results demonstrate that dentate granule cells gain task-relevant firing patterns through reinforcement learning and could be a potential link between sensory decisions and spatial navigation.

