A Framework for Analysing State-abstraction Methods

2021 ◽  
pp. 103608
Author(s):  
Christer Bäckström ◽  
Peter Jonsson
Author(s):  
Carlos Diuk ◽  
Michael Littman

Reinforcement learning (RL) deals with the problem of an agent that must learn, through its interactions with an environment, how to behave so as to maximize its utility (Sutton & Barto, 1998; Kaelbling, Littman & Moore, 1996). RL problems are usually formalized as Markov Decision Processes (MDPs), which consist of a finite set of states and a finite set of actions available to the agent. At any given point in time, the agent is in some state and picks an action; it then observes the new state the action leads to and receives a reward signal. The agent's goal is to maximize its long-term reward. In this standard formalization, no particular structure or relationship between states is assumed. However, learning in environments with extremely large state spaces is infeasible without some form of generalization. Exploiting the underlying structure of a problem can enable such generalization and has long been recognized as an important aspect of representing sequential decision tasks (Boutilier et al., 1999).

Hierarchical reinforcement learning is the subfield of RL that deals with the discovery and/or exploitation of this underlying structure. Two main ideas come into play. The first is to break a task into a hierarchy of smaller subtasks, each of which can be learned faster and more easily than the whole problem; subtasks can also be performed multiple times in the course of achieving the larger task, reusing accumulated knowledge and skills. The second is to use state abstraction within subtasks: not every subtask needs to be concerned with every aspect of the state space, so some states can be abstracted away and treated as identical for the purposes of that subtask.
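The MDP interaction loop and the idea of state abstraction can be sketched in a few lines. The environment, the abstraction function `phi`, and all parameter values below are invented for illustration; the sketch runs tabular Q-learning over abstract states rather than ground states, so aliased states share a single learned value.

```python
import random

# A toy MDP to illustrate the formalization above: states 0..5 on a line,
# actions move left (-1) or right (+1), and reaching state 5 ends the
# episode with reward 1. (Environment and abstraction are invented for
# illustration, not taken from the abstract.)
N_STATES = 6
ACTIONS = (-1, 1)

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def phi(state):
    """State abstraction: map ground states to coarse regions.

    Ground states treated as the same for this task share one abstract
    state, so the agent learns one value per region, not per state.
    """
    return state // 2  # regions {0,1}, {2,3}, {4,5}

def q_learning(episodes=2000, alpha=0.2, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = {(z, a): 0.0 for z in range(3) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        for _ in range(20):  # cap episode length
            a = rng.choice(ACTIONS)  # purely exploratory behavior policy
            s2, r = step(s, a)
            done = s2 == N_STATES - 1
            target = r if done else r + gamma * max(Q[(phi(s2), b)] for b in ACTIONS)
            Q[(phi(s), a)] += alpha * (target - Q[(phi(s), a)])
            if done:
                break
            s = s2
    return Q

Q = q_learning()
policy = {z: max(ACTIONS, key=lambda a: Q[(z, a)]) for z in range(3)}
```

With the abstraction, the agent maintains 6 Q-values instead of 12; the greedy policy over abstract states still moves right toward the goal, illustrating how abstraction trades resolution for faster learning.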


2008 ◽  
Vol 45 (7-8) ◽  
pp. 479-536 ◽  
Author(s):  
Ferruccio Damiani ◽  
Elena Giachino ◽  
Paola Giannini ◽  
Sophia Drossopoulou

2006 ◽  
Vol 18 (2) ◽  
pp. 159-172 ◽  
Author(s):  
Jefferson Provost ◽  
Benjamin J. Kuipers ◽  
Risto Miikkulainen

2007 ◽  
Vol 30 ◽  
pp. 51-100 ◽  
Author(s):  
V. Bulitko ◽  
N. Sturtevant ◽  
J. Lu ◽  
T. Yau

Real-time heuristic search methods are used by situated agents in applications that require the amount of planning per move to be independent of problem size. Such agents plan only a few actions at a time in a local search space and avoid getting trapped in local minima by improving their heuristic function over time. We extend a wide class of real-time search algorithms with automatically built state abstraction and prove completeness and convergence of the resulting family of algorithms. We then analyze the impact of abstraction in an extensive empirical study of real-time pathfinding. Abstraction is found to improve efficiency by providing better trade-offs between planning time, learning speed, and other negatively correlated performance measures.
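The class of algorithms described above follows the LRTA* pattern: plan one step in a local search space, update the heuristic of the current state, and move. A minimal sketch on a small grid (the grid, walls, and heuristic are invented for illustration; the paper's contribution, the automatically built abstraction layer, is not shown):

```python
# Minimal LRTA*-style real-time search on a 5x5 grid with a wall segment.
# Planning per move is constant: only the immediate neighbors are examined.
GRID_W, GRID_H = 5, 5
GOAL = (4, 4)
WALLS = {(2, 1), (2, 2), (2, 3)}

def neighbors(s):
    x, y = s
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        n = (x + dx, y + dy)
        if 0 <= n[0] < GRID_W and 0 <= n[1] < GRID_H and n not in WALLS:
            yield n

def h0(s):
    """Admissible initial heuristic: Manhattan distance to the goal."""
    return abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1])

def lrta_star(start, max_moves=200):
    h = {}  # learned heuristic values, refined online
    s = start
    path = [s]
    for _ in range(max_moves):
        if s == GOAL:
            return path
        # Plan one step: pick the neighbor minimizing the cost-to-go estimate.
        best = min(neighbors(s), key=lambda n: 1 + h.get(n, h0(n)))
        # Learning rule: raise h(s) toward the best neighbor's estimate, so
        # repeated visits fill in local minima and the agent escapes them.
        h[s] = max(h.get(s, h0(s)), 1 + h.get(best, h0(best)))
        s = best
        path.append(s)
    return path

path = lrta_star((0, 2))
```

Each move costs O(branching factor) regardless of grid size, which is the real-time property; the learned `h` table is what abstraction would compress in the extended algorithms.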


2021 ◽  
Author(s):  
Mirko Klukas ◽  
Sugandha Sharma ◽  
Yilun Du ◽  
Tomas Lozano-Perez ◽  
Leslie Pack Kaelbling ◽  
...  

When animals explore spatial environments, their representations often fragment into multiple maps. What determines these map fragmentations, and can we predict where they will occur with simple principles? We pose the problem of fragmentation of an environment as one of (online) spatial clustering. Taking inspiration from the notion of a "contiguous region" in robotics, we develop a theory in which fragmentation decisions are driven by surprisal. When this criterion is implemented with boundary, grid, and place cells in various environments, it produces map fragmentations from the first exploration of each space. Augmented with a long-term spatial memory and a rule similar to the distance-dependent Chinese Restaurant Process for selecting among relevant memories, the theory predicts the reuse of map fragments in environments with repeating substructures. Our model provides a simple rule for generating spatial state abstractions and predicts map fragmentations observed in electrophysiological recordings. It further predicts that there should be "fragmentation decision" or "fracture" cells, which in multicompartment environments could be called "doorway" cells. Finally, we show that the resulting abstractions can lead to large (orders of magnitude) improvements in the ability to plan and navigate through complex environments.
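The core decision rule, fragment when the current observation is too surprising under the current map's statistics, can be sketched without any of the neural machinery. Everything below is a toy stand-in: a scalar "wall distance" observation, a running-mean Gaussian predictor, and a hand-picked threshold, none of which come from the paper's model.

```python
# Toy sketch of surprisal-driven map fragmentation. The agent walks down a
# corridor of two rooms; its observation is the local wall distance. A
# unit-variance Gaussian around the running mean predicts the next
# observation; when the (unnormalized) surprisal exceeds a threshold,
# the current map fragment is closed and a fresh one begins.

def fragment(observations, threshold=3.0):
    fragments, current = [], []
    mean, n = 0.0, 0
    for obs in observations:
        # Surprisal up to a constant: 0.5 * (obs - mean)^2 for unit variance.
        if n > 0 and 0.5 * (obs - mean) ** 2 > threshold:
            fragments.append(current)       # close the current map fragment
            current, mean, n = [], 0.0, 0   # start a fresh one ("doorway")
        current.append(obs)
        n += 1
        mean += (obs - mean) / n            # online mean update
    if current:
        fragments.append(current)
    return fragments

# Two "rooms" with wall distances near 1 and 5, joined by a doorway.
obs = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]
rooms = fragment(obs)
```

The abrupt observation change at the doorway produces a surprisal spike and hence a fragmentation decision, splitting the trajectory into one fragment per room; the paper's long-term memory and CRP-like reuse rule would sit on top of this, matching new fragments against stored ones.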

