Safe State Abstraction and Reusable Continuing Subtasks in Hierarchical Reinforcement Learning

Author(s):
Bernhard Hengst

Author(s):
Carlos Diuk, Michael Littman

Reinforcement learning (RL) deals with the problem of an agent that must learn, through its interactions with an environment, how to behave so as to maximize its utility (Sutton & Barto, 1998; Kaelbling, Littman & Moore, 1996). Reinforcement learning problems are usually formalized as Markov Decision Processes (MDPs), which consist of a finite set of states and a finite set of actions the agent can perform. At any given point in time, the agent is in some state and picks an action; it then observes the new state the action leads to and receives a reward signal. The goal of the agent is to maximize its long-term reward. In this standard formalization, no particular structure or relationship between states is assumed. However, learning in environments with extremely large state spaces is infeasible without some form of generalization, and exploiting the underlying structure of a problem to obtain that generalization has long been recognized as an important aspect of representing sequential decision tasks (Boutilier et al., 1999).

Hierarchical Reinforcement Learning is the subfield of RL concerned with discovering and exploiting this underlying structure. Two main ideas come into play. The first is to decompose a task into a hierarchy of smaller subtasks, each of which can be learned faster and more easily than the whole problem; a subtask may also be invoked many times in the course of achieving the larger task, reusing the knowledge and skills already accumulated. The second idea is state abstraction within subtasks: not every subtask needs to be concerned with every aspect of the state space, so states that differ only in irrelevant variables can be abstracted away and treated as identical for the purposes of that subtask.
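
To make these two ideas concrete, here is a minimal, self-contained Python sketch (our illustration, not the construction from either article above; the corridor domain, the navigate subtask, and all hyperparameters are assumptions chosen for brevity). A "fetch the key, then reach the door" task is decomposed into two invocations of a reusable navigate subtask, and the subtask learns over an abstract state that omits the has-key variable, which is irrelevant to navigation:

    import random

    N = 10                # corridor cells 0..N-1
    KEY, DOOR = 2, 8      # hypothetical key and door locations
    ACTIONS = (-1, +1)    # primitive actions: move left / move right

    # One Q-table per navigation target. The full state is (pos, has_key),
    # but navigation behaves identically with or without the key, so the
    # subtask abstracts the state down to just the position.
    Q = {goal: {(pos, a): 0.0 for pos in range(N) for a in ACTIONS}
         for goal in (KEY, DOOR)}

    def navigate(pos, goal, eps=0.1, alpha=0.5, gamma=0.95):
        """Run (and tabular-Q-learn) the navigate-to-goal subtask."""
        steps, q = 0, Q[goal]
        while pos != goal:
            if random.random() < eps:                        # explore
                a = random.choice(ACTIONS)
            else:                                            # exploit
                a = max(ACTIONS, key=lambda act: q[(pos, act)])
            nxt = min(N - 1, max(0, pos + a))                # clipped move
            best = 0.0 if nxt == goal else max(q[(nxt, b)] for b in ACTIONS)
            q[(pos, a)] += alpha * (-1.0 + gamma * best - q[(pos, a)])
            pos, steps = nxt, steps + 1
        return pos, steps

    # Root task: a fixed two-level hierarchy. The same subtask machinery is
    # reused for both subgoals and improves across episodes.
    for episode in range(200):
        pos = random.randrange(N)
        pos, s1 = navigate(pos, KEY)     # subtask 1: reach the key
        pos, s2 = navigate(pos, DOOR)    # subtask 2: reach the door
    print("steps in final episode:", s1 + s2)

Because each subtask's Q-table is indexed only by position, it has N x 2 entries instead of the 2N x 2 entries the flat state (pos, has_key) would require; the abstraction is safe here because the dropped variable affects neither the subtask's dynamics nor its rewards.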


2021, Vol. 54(5), pp. 1-35
Author(s):
Shubham Pateria, Budhitama Subagdja, Ah-hwee Tan, Chai Quek

Hierarchical Reinforcement Learning (HRL) enables the autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. Over the past years, the landscape of HRL research has grown substantially, producing a wide variety of approaches. A comprehensive overview of this vast landscape is needed to study HRL in an organized manner. We provide a survey of the diverse HRL approaches addressing the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is organized according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate future research in HRL. Furthermore, in the Supplementary Material we outline a few task domains suitable for evaluating HRL approaches, together with some interesting examples of practical applications of HRL.
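
Many of the approaches such a survey covers build on a common formalism for temporally extended actions, the options framework (Sutton, Precup & Singh, 1999). As a minimal illustrative sketch (the Option fields and run loop below are our own assumptions, not an API from the survey), a subtask can be packaged as an initiation set, an internal policy, and a termination condition, and then executed as a single high-level step:

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Option:
        """A temporally extended action: where it may start, how it
        behaves, and when it stops."""
        can_start: Callable[[int], bool]    # initiation set I
        policy: Callable[[int], int]        # intra-option policy pi(s)
        terminates: Callable[[int], bool]   # termination condition beta(s)

    def run_option(env_step, state, option):
        """Execute one option to termination; return final state and total reward."""
        assert option.can_start(state), "option not available in this state"
        total = 0.0
        while not option.terminates(state):
            state, r = env_step(state, option.policy(state))
            total += r
        return state, total

    # Toy usage: on a 1-D corridor, an option that walks right until cell 5.
    step = lambda s, a: (s + a, -1.0)       # hypothetical deterministic dynamics
    go_right_to_5 = Option(can_start=lambda s: s < 5,
                           policy=lambda s: +1,
                           terminates=lambda s: s >= 5)
    print(run_option(step, 0, go_right_to_5))   # -> (5, -5.0)

A higher-level policy then chooses among such options rather than among primitive actions, which is what shortens the effective horizon of the decision problem.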

