Q*-based state abstraction and knowledge discovery in reinforcement learning

2014 · Vol. 18(6) · pp. 1153–1175
Author(s): Nahid Taherian, Mohammad Ebrahim Shiri


Author(s): Carlos Diuk, Michael Littman

Reinforcement learning (RL) deals with the problem of an agent that must learn how to behave in order to maximize its utility through interactions with an environment (Sutton & Barto, 1998; Kaelbling, Littman & Moore, 1996). Reinforcement learning problems are usually formalized as Markov Decision Processes (MDPs), which consist of a finite set of states and a finite set of possible actions that the agent can perform. At any given point in time, the agent is in a certain state and picks an action; it then observes the new state this action leads to and receives a reward signal. The goal of the agent is to maximize its long-term reward. In this standard formalization, no particular structure or relationship between states is assumed. However, learning in environments with extremely large state spaces is infeasible without some form of generalization. Exploiting the underlying structure of a problem can enable such generalization, and doing so has long been recognized as an important aspect of representing sequential decision tasks (Boutilier et al., 1999). Hierarchical Reinforcement Learning is the subfield of RL that deals with the discovery and/or exploitation of this underlying structure. Two main ideas come into play in hierarchical RL. The first is to break a task into a hierarchy of smaller subtasks, each of which can be learned faster and more easily than the whole problem; subtasks can also be performed multiple times in the course of achieving the larger task, reusing accumulated knowledge and skills. The second idea is to use state abstraction within subtasks: not every subtask needs to be concerned with every aspect of the state space, so some states can be abstracted away and treated as the same for the purposes of the given subtask.
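
As a rough, generic illustration of the state-abstraction idea described above (not taken from the cited work), the following Python sketch layers a state-abstraction function phi over tabular Q-learning, so that ground states mapped to the same abstract state share value estimates. All names here (QLearner, phi, the toy corridor task) are hypothetical.

# Minimal sketch (not from the cited paper): tabular Q-learning with a
# state-abstraction function `phi`; ground states that phi maps together
# share one row of the Q-table.  All names are illustrative.
import random
from collections import defaultdict

class QLearner:
    def __init__(self, actions, phi, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions
        self.phi = phi                        # abstraction: ground state -> abstract state
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.Q = defaultdict(float)           # keyed by (abstract state, action)

    def act(self, s):
        z = self.phi(s)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        best = max(self.Q[(z, a)] for a in self.actions)
        return random.choice([a for a in self.actions if self.Q[(z, a)] == best])

    def update(self, s, a, r, s_next, done):
        z, z_next = self.phi(s), self.phi(s_next)
        bootstrap = 0.0 if done else max(self.Q[(z_next, b)] for b in self.actions)
        self.Q[(z, a)] += self.alpha * (r + self.gamma * bootstrap - self.Q[(z, a)])

# Toy corridor with positions 0..9 and reward only at position 9; the coarse
# abstraction phi(pos) = pos // 5 still preserves the optimal "move right" policy.
def step(pos, action):                        # primitive actions: -1 or +1
    nxt = max(0, min(9, pos + action))
    return nxt, (1.0 if nxt == 9 else 0.0), nxt == 9

agent = QLearner(actions=[-1, +1], phi=lambda pos: pos // 5)
for _ in range(500):
    pos, done = 0, False
    while not done:
        a = agent.act(pos)
        nxt, r, done = step(pos, a)
        agent.update(pos, a, r, nxt, done)
        pos = nxt

Because the abstraction collapses positions within each half of the corridor, the Q-table has four entries instead of twenty, while the greedy policy it induces is still optimal for this toy task.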


Author(s):  
David Abel

Reinforcement learning presents a challenging problem: agents must generalize experiences, efficiently explore the world, and learn from feedback that is delayed and often sparse, all while making use of a limited computational budget. Abstraction is essential to all of these endeavors. Through abstraction, agents can form concise models of both their surroundings and their behavior, supporting effective decision making in diverse and complex environments. To this end, the goal of my doctoral research is to characterize the role abstraction plays in reinforcement learning, with a focus on state abstraction. I offer three desiderata articulating what it means for a state abstraction to be useful, and introduce classes of state abstractions that provide a partial path toward satisfying these desiderata. Collectively, I develop theory for state abstractions that can 1) preserve near-optimal behavior, 2) be learned and computed efficiently, and 3) lower the time or data needed to make effective decisions. I close by discussing extensions of these results to an information-theoretic paradigm of abstraction, and an extension to hierarchical abstraction that enjoys the same desirable properties.
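
One standard way to make the first desideratum precise (stated here as a generic definition from the approximate state-abstraction literature, not quoted from the dissertation) is the approximate Q*-irrelevance abstraction: a mapping phi may only aggregate states whose optimal action values agree to within a tolerance epsilon.

% Approximate Q^*-irrelevance (generic definition; \epsilon \ge 0 is a tolerance):
\phi_{Q^*,\epsilon}(s_1) = \phi_{Q^*,\epsilon}(s_2)
  \;\Longrightarrow\;
  \max_{a \in \mathcal{A}} \bigl| Q^*(s_1, a) - Q^*(s_2, a) \bigr| \le \epsilon

Setting epsilon = 0 recovers exact Q*-irrelevance; for epsilon > 0, results in this line of work bound the value lost by the best policy defined over the abstract states as a function of epsilon and the discount factor gamma.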


2020 ◽  
Author(s):  
Liyu Xia ◽  
Anne G. E. Collins

Humans use prior knowledge to efficiently solve novel tasks, but how they structure past knowledge to enable such fast generalization is not well understood. We recently proposed that hierarchical state abstraction enabled generalization of simple one-step rules, by inferring context clusters for each rule. However, humans' daily tasks are often temporally extended and necessitate more complex, multi-step, hierarchically structured strategies. The options framework in hierarchical reinforcement learning provides a theoretical framework for representing such transferable strategies. Options are abstract multi-step policies, assembled from simpler one-step actions or other options, that can represent meaningful, reusable strategies as temporal abstractions. We developed a novel sequential decision-making protocol to test whether humans learn and transfer multi-step options. In a series of four experiments, we found transfer effects at multiple hierarchical levels of abstraction that could not be explained by flat reinforcement learning models or by hierarchical models lacking temporal abstraction. We extended the options framework to develop a quantitative model that blends temporal and state abstractions. Our model captures the transfer effects observed in human participants. Our results provide evidence that humans create and compose hierarchical options, and use them to explore in novel contexts, thereby transferring past knowledge and speeding up learning.
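
To make the options framework concrete (a generic illustration, not the authors' computational model), an option can be written as a triple: an initiation set I, an intra-option policy pi, and a termination condition beta. The Python sketch below, with hypothetical names, composes primitive one-step actions into one reusable multi-step option.

# Minimal sketch of the options framework (generic illustration, not the
# authors' model): an option is a triple (initiation set I, intra-option
# policy pi, termination condition beta).  Names are hypothetical.
import random
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Option:
    name: str
    init_set: Set[int]                    # states where the option may start
    policy: Callable[[int], int]          # intra-option policy: state -> primitive action
    beta: Callable[[int], float]          # termination probability in each state

    def run(self, state, step_fn):
        """Execute the option until it terminates; return final state and total reward."""
        total = 0.0
        while True:
            action = self.policy(state)
            state, reward = step_fn(state, action)
            total += reward
            if random.random() < self.beta(state):
                return state, total

# A hand-coded "go to state 5" option in a 10-state chain, built from the
# primitive actions -1 and +1 (hypothetical toy task).
def chain_step(state, action):
    nxt = max(0, min(9, state + action))
    return nxt, (1.0 if nxt == 5 else 0.0)

go_to_5 = Option(
    name="go-to-5",
    init_set=set(range(10)),
    policy=lambda s: +1 if s < 5 else -1,
    beta=lambda s: 1.0 if s == 5 else 0.0,
)
final_state, ret = go_to_5.run(0, chain_step)   # terminates at state 5

A higher-level learner can then treat go_to_5 as a single temporally extended action, which is the sense in which options act as transferable, reusable strategies.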

