Real-Time Planning as Decision-Making under Uncertainty

Author(s):  
Andrew Mitchell ◽  
Wheeler Ruml ◽  
Fabian Spaniol ◽  
Jörg Hoffmann ◽
Marek Petrik

In real-time planning, an agent must select the next action to take within a fixed time bound. Many popular real-time heuristic search methods approach this by expanding nodes using time-limited A* and selecting the action leading toward the frontier node with the lowest f value. In this paper, we reconsider real-time planning as a problem of decision-making under uncertainty. We propose treating heuristic values as uncertain evidence and we explore several backup methods for aggregating this evidence. We then propose a novel lookahead strategy that expands nodes to minimize risk, the expected regret in case a non-optimal action is chosen. We evaluate these methods in a simple synthetic benchmark and the sliding tile puzzle and find that they outperform previous methods. This work illustrates how uncertainty can arise even when solving deterministic planning problems, due to the inherent ignorance of time-limited search algorithms about those portions of the state space that they have not computed, and how an agent can benefit from explicitly metareasoning about this uncertainty.
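
As a rough illustration of the idea (a sketch under an assumed Gaussian belief model, not the paper's exact backup or risk computations; all names here are hypothetical), the Python below treats each top-level action's true cost-to-go as a Gaussian around its heuristic estimate, estimates risk as the expected regret of committing now to the apparently best action, and directs further search toward the action whose expansion would reduce that risk the most.

```python
# Sketch: risk-based lookahead. Assumption: each top-level action's
# true cost-to-go is Gaussian around its heuristic estimate, and
# expanding beneath an action shrinks that belief's variance. An
# illustration of the idea, not the paper's exact algorithm.
import random

def risk(beliefs, samples=5000):
    """Expected regret of committing now to the action with the
    lowest mean: E[value(best) - min_a value(a)]."""
    best = min(beliefs, key=lambda a: beliefs[a][0])
    total = 0.0
    for _ in range(samples):
        draws = {a: random.gauss(mu, sd) for a, (mu, sd) in beliefs.items()}
        total += draws[best] - min(draws.values())
    return total / samples

def expand_to_minimize_risk(beliefs, shrink=0.5):
    """Choose which action's subtree to search next: the one whose
    (hypothetical) expansion lowers the expected regret the most."""
    def risk_after(a):
        mu, sd = beliefs[a]
        return risk({**beliefs, a: (mu, sd * shrink)})
    return min(beliefs, key=risk_after)

beliefs = {"a": (10.0, 5.0), "b": (11.0, 0.5)}
print(risk(beliefs))                     # nonzero: 'a' might not truly be best
print(expand_to_minimize_risk(beliefs))  # 'a': search effort goes where it helps
```

In this toy instance, action "a" looks best on average but is highly uncertain, so expanding beneath it shrinks the risk far more than expanding beneath the near-certain "b"; this is the intuition behind spending lookahead where it can change the decision.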

2007 ◽  
Vol 30 ◽  
pp. 51-100 ◽  
Author(s):  
V. Bulitko ◽  
N. Sturtevant ◽  
J. Lu ◽  
T. Yau

Real-time heuristic search methods are used by situated agents in applications that require the amount of planning per move to be independent of the problem size. Such agents plan only a few actions at a time in a local search space and avoid getting trapped in local minima by improving their heuristic function over time. We extend a wide class of real-time search algorithms with automatically-built state abstraction and prove completeness and convergence of the resulting family of algorithms. We then analyze the impact of abstraction in an extensive empirical study in real-time pathfinding. Abstraction is found to improve efficiency by providing better trade-offs between planning time, learning speed, and other negatively correlated performance measures.
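
The learning step underlying this family of algorithms is the classic LRTA*-style update: raise the current state's heuristic to its best one-step lookahead value so the agent cannot cycle forever in a heuristic depression. A minimal sketch follows (the function names and graph encoding are assumptions, and the paper's abstraction machinery is omitted):

```python
# Sketch of the core LRTA*-style plan/learn/act cycle that this
# family of real-time search methods builds on.

def lrta_star_step(state, neighbors, cost, h):
    """One cycle: look at immediate successors, update h, move.
    neighbors(s) -> iterable of successor states
    cost(s, s')  -> edge cost
    h            -> dict mapping state -> heuristic estimate (mutated)."""
    best_succ = min(neighbors(state), key=lambda s: cost(state, s) + h[s])
    # Learning: make the heuristic locally consistent so repeated
    # visits cannot loop without making progress.
    h[state] = max(h[state], cost(state, best_succ) + h[best_succ])
    return best_succ  # the agent moves here

# Tiny example: a 3-state chain a - b - goal with unit costs.
nbrs = {"a": ["b"], "b": ["a", "goal"], "goal": []}
h = {"a": 0.0, "b": 0.0, "goal": 0.0}
state = "a"
while state != "goal":
    state = lrta_star_step(state, lambda s: nbrs[s], lambda s, t: 1.0, h)
print(h)  # h("a") and h("b") have been raised toward their true distances
```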


Author(s):  
Tianyi Gu

Heuristic search methods are widely used in many real-world autonomous systems. Yet we often want to solve search problems that are larger than time allows. To address these challenging problems, even suboptimally, a planning agent should be smart enough to intelligently allocate its computational resources and think carefully about where in the state space it should spend its time searching. To find optimal solutions, we must examine every node that is not provably too expensive. In contrast, to find suboptimal solutions under time pressure, we need to be very selective about which nodes to examine. In this work, we demonstrate that estimates of uncertainty, represented as belief distributions, can be used to drive search effectively. This type of algorithmic approach is known as metareasoning: reasoning about which reasoning to do. We provide examples of improved algorithms for real-time search, bounded-cost search, and situated planning.
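
As one concrete instance of belief-driven search (a sketch under an assumed Gaussian belief model, not the dissertation's exact algorithms; the node fields are hypothetical), bounded-cost search can expand the open node most likely to lead to a solution within the cost bound:

```python
# Sketch: belief-driven expansion order for bounded-cost search.
# Assumption: a node's true solution cost is Gaussian around its
# f value, with a per-node uncertainty estimate.
import math
from collections import namedtuple

Node = namedtuple("Node", ["name", "f", "uncertainty"])

def p_within_bound(mean, stddev, bound):
    """P(true cost <= bound) under a Gaussian belief."""
    if stddev == 0:
        return 1.0 if mean <= bound else 0.0
    z = (bound - mean) / stddev
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def choose_expansion(open_list, bound):
    """Expand the open node with the best chance of beating the bound."""
    return max(open_list, key=lambda n: p_within_bound(n.f, n.uncertainty, bound))

# n2 has a better mean AND far less uncertainty, so it is the
# near-certain bet under the bound of 100.
open_list = [Node("n1", 95.0, 20.0), Node("n2", 90.0, 1.0)]
print(choose_expansion(open_list, bound=100.0).name)  # n2
```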


Author(s):  
Takeshi Fukase ◽  
Yuichi Kobayashi ◽  
Ryuichi Ueda ◽  
Takanobu Kawabe ◽  
Tamio Arai

Author(s):  
Andreas A. Malikopoulos ◽  
Panos Y. Papalambros ◽  
Dennis N. Assanis

Modeling dynamic systems subject to stochastic disturbances in order to derive a control policy is a ubiquitous task in engineering. In some instances, however, obtaining a model of a system may be impractical or impossible. Alternative approaches have been developed using a simulation-based stochastic framework, in which the system interacts with its environment in real time and obtains information that can be processed to produce an optimal control policy. In this context, the problem of developing a policy for controlling the system’s behavior is formulated as a sequential decision-making problem under uncertainty. This paper considers real-time sequential decision-making under uncertainty modeled as a Markov Decision Process (MDP). A state-space representation model is constructed through a learning mechanism and is used to improve system performance over time. The model allows decision making based on gradually enhanced knowledge of the system’s response as it transitions from one state to another, in conjunction with the actions taken at each state. A learning algorithm is implemented that realizes, in real time, the optimal control policy associated with the state transitions. The proposed method is demonstrated on a single cart-pole balancing problem and a vehicle cruise-control problem.
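
The abstract describes learning an optimal policy from real-time interaction rather than from a prior model. As a generic stand-in for that idea (not the paper's specific learning mechanism; the toy environment, its API, and the hyperparameters are all assumptions made for illustration), the sketch below runs tabular Q-learning against a small chain-world:

```python
# Sketch: real-time sequential decision-making on an MDP via tabular
# Q-learning -- a generic model-free stand-in for the paper's method.
import random
from collections import defaultdict

class ChainEnv:
    """Toy environment: walk right along a 6-position chain to a goal."""
    actions = ["left", "right"]
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):
        self.pos = max(0, self.pos + (1 if action == "right" else -1))
        done = self.pos == 5
        return self.pos, (1.0 if done else -0.01), done

def q_learning(env, episodes=300, alpha=0.2, gamma=0.95,
               epsilon=0.2, max_steps=200):
    Q = defaultdict(float)  # (state, action) -> estimated return
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            # Act in real time: explore occasionally, otherwise exploit.
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s2, reward, done = env.step(a)
            # Refine knowledge of the system response for this transition.
            target = reward if done else reward + gamma * max(
                Q[(s2, act)] for act in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
            if done:
                break
    return Q

Q = q_learning(ChainEnv())
print(max(Q[(0, act)] for act in ChainEnv.actions))  # approaches optimal return
```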


Author(s):  
Rebecca Eifler ◽  
Maximilian Fickert ◽  
Jörg Hoffmann ◽  
Wheeler Ruml

In real-time planning, the planner must select the next action within a fixed time bound. Because a complete plan may not have been found, the selected action might not lead to a goal and the agent may need to return to its current state. To preserve completeness, real-time search methods incorporate learning, in which heuristic values are updated. Previous work in real-time search has used table-based heuristics, in which the values of states are updated individually. In this paper, we explore the use of abstraction-based heuristics. By refining the abstraction on-line, we can update the values of multiple states, including ones the agent has not yet generated. We test this idea empirically using Cartesian abstractions in the Fast Downward planner. Results on various benchmarks, including the sliding tile puzzle and several IPC domains, indicate that the approach can improve performance compared to traditional heuristic updating. This work brings abstraction refinement, a powerful technique from offline planning, into the real-time setting.
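
The key mechanism is that an abstraction maps many concrete states to one abstract state, so a single value update propagates to all of them, including states the agent has not yet generated. A toy sketch follows (the class and abstraction function are illustrative assumptions; Cartesian abstraction refinement in Fast Downward is far more involved):

```python
# Sketch: abstraction-based heuristic updating. One update to an
# abstract state's value covers every concrete state that maps to it,
# unlike a table-based heuristic that updates states one at a time.

class AbstractionHeuristic:
    def __init__(self, alpha, abstract_h):
        self.alpha = alpha            # concrete state -> abstract state
        self.abstract_h = abstract_h  # abstract state -> heuristic value

    def h(self, state):
        return self.abstract_h[self.alpha(state)]

    def update(self, state, new_value):
        # Raising the abstract value updates the whole equivalence class.
        a = self.alpha(state)
        self.abstract_h[a] = max(self.abstract_h[a], new_value)

# Toy example: abstract away everything but the first state variable.
alpha = lambda s: s[0]
heur = AbstractionHeuristic(alpha, {0: 5.0, 1: 3.0})
heur.update((1, "anything"), 7.0)
print(heur.h((1, "something_else")))  # 7.0 -- updated without being generated
```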

