Hierarchical Learning from Demonstrations for Long-Horizon Tasks

Author(s):  
Boyao Li ◽  
Jiayi Li ◽  
Tao Lu ◽  
Yinghao Cai ◽  
Shuo Wang
1990 ◽  
Vol 66 (1) ◽  
pp. 187-193
Author(s):  
William E. Hauck ◽  
J. William Moore ◽  
Leonard Sancilio

2020 ◽  
Author(s):  
Pieter Verbeke ◽  
Kate Ergo ◽  
Esther De Loof ◽  
Tom Verguts

Abstract: In recent years, several hierarchical extensions of well-known learning algorithms have been proposed. For example, when stimulus-action mappings vary across time or context, the brain may learn two or more stimulus-action mappings in separate modules and additionally (at a hierarchically higher level) learn to switch appropriately between those modules. However, how the brain mechanistically coordinates neural communication to implement such hierarchical learning remains unknown. Therefore, the current study tests a recent computational model that proposed how midfrontal theta oscillations implement such hierarchical learning via the principle of binding by synchrony (Sync model). More specifically, the Sync model employs bursts at theta frequency to flexibly bind appropriate task modules by synchrony. A 64-channel EEG signal was recorded while 27 human subjects (21 female, 6 male) performed a probabilistic reversal learning task. In line with the Sync model, post-feedback theta power showed a linear relationship with negative prediction errors, but not with positive prediction errors. This relationship was especially pronounced for subjects with a better behavioral fit of the Sync model (measured via AIC). Also consistent with Sync model simulations, theta phase-coupling between midfrontal electrodes and temporo-parietal electrodes was stronger after negative feedback. Our data suggest that the brain uses theta power and synchronization for flexibly switching between task rule modules, which is useful, for example, when multiple stimulus-action mappings must be retained and used.

Significance Statement: Everyday life requires flexibility in switching between several rules. A key question in understanding this ability is how the brain mechanistically coordinates such switches. The current study tests a recent computational framework (Sync model) that proposed how midfrontal theta oscillations coordinate activity in hierarchically lower task-related areas. In line with predictions of this Sync model, midfrontal theta power was stronger when rule switches were most likely (strong negative prediction error), especially in subjects who obtained a better model fit. Additionally, theta phase connectivity between midfrontal and task-related areas was increased after negative feedback. Thus, the data support the hypothesis that the brain uses theta power and synchronization for flexibly switching between rules.
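
The abstract refers to signed prediction errors and to model fit measured via AIC without defining either. As a rough illustration only, the sketch below computes prediction errors with a simple delta rule and evaluates the standard AIC formula. It is not the Sync model: the delta rule, the function names, the learning rate, and the toy reward sequence are all assumptions introduced here for illustration.

```python
def delta_rule_prediction_errors(rewards, learning_rate=0.3, initial_value=0.5):
    """Return the signed prediction error on each trial for one tracked option.

    Assumption: a plain delta-rule learner, used here only as a stand-in;
    the Sync model described in the abstract is more elaborate.
    """
    value = initial_value
    errors = []
    for reward in rewards:
        error = reward - value          # positive if outcome was better than expected
        errors.append(error)
        value += learning_rate * error  # move the estimate toward the outcome
    return errors


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower means better fit."""
    return 2 * n_params - 2 * log_likelihood


# Toy sequence (hypothetical): three rewarded trials, then a reversal.
rewards = [1, 1, 1, 0, 0]
errors = delta_rule_prediction_errors(rewards)

# Negative prediction errors appear once the rewarded option stops paying out.
negative_errors = [e for e in errors if e < 0]
print(errors)
print(negative_errors)
```

In this toy sequence the negative prediction errors appear only after the reversal, which is the kind of trial on which the abstract reports increased theta power.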

