Simulation-based policy generation using large-scale Markov decision processes

Author(s):  
C.W. Zobel ◽  
W.T. Scherer
Author(s):  
Ruiyang Song ◽  
Kuang Xu

We propose and analyze a temporal concatenation heuristic for solving large-scale finite-horizon Markov decision processes (MDPs), which divides the MDP into smaller sub-problems along the time horizon and generates an overall solution by simply concatenating the optimal solutions from these sub-problems. As a “black box” architecture, temporal concatenation works with a wide range of existing MDP algorithms. Our main results characterize the regret of temporal concatenation compared to the optimal solution. We provide upper bounds for general MDP instances, as well as a family of MDP instances in which the upper bounds are shown to be tight. Together, our results demonstrate temporal concatenation's potential for substantial speed-up at the expense of some performance degradation.
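The idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a small tabular, reward-maximizing MDP with stationary dynamics, uses exact backward induction as the "black box" solver, and gives every sub-problem a zero terminal value. All function names (`backward_induction`, `temporal_concatenation`, `evaluate`, `random_mdp`) are hypothetical and chosen only for illustration.

```python
import random


def backward_induction(P, R, horizon):
    """Solve a finite-horizon MDP exactly (assumption: zero terminal value).

    P[a][s][s2] is a transition probability, R[a][s] an immediate reward.
    Returns (policy, value): policy[t][s] is the action taken at step t in
    state s, and value[s] is the optimal expected total reward from step 0.
    """
    n_actions, n_states = len(P), len(P[0])
    value = [0.0] * n_states
    policy = [None] * horizon
    for t in reversed(range(horizon)):
        # q[s][a]: reward now plus expected value of the remaining steps.
        q = [[R[a][s] + sum(P[a][s][s2] * value[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        policy[t] = [max(range(n_actions), key=q[s].__getitem__)
                     for s in range(n_states)]
        value = [max(q[s]) for s in range(n_states)]
    return policy, value


def temporal_concatenation(P, R, horizon, n_segments):
    """Split the horizon into segments, solve each one on its own, and
    concatenate the per-segment policies (illustrative assumption: each
    segment ignores everything that happens after it ends)."""
    bounds = [k * horizon // n_segments for k in range(n_segments + 1)]
    policy = []
    for lo, hi in zip(bounds, bounds[1:]):
        segment_policy, _ = backward_induction(P, R, hi - lo)
        policy.extend(segment_policy)
    return policy


def evaluate(P, R, policy):
    """Expected total reward of a fixed time-dependent policy, per start state."""
    n_states = len(P[0])
    value = [0.0] * n_states
    for rule in reversed(policy):
        value = [R[rule[s]][s]
                 + sum(P[rule[s]][s][s2] * value[s2] for s2 in range(n_states))
                 for s in range(n_states)]
    return value


def random_mdp(n_states, n_actions, seed=0):
    """Hypothetical helper: a random tabular MDP for experimentation."""
    rng = random.Random(seed)
    P = []
    for _ in range(n_actions):
        rows = []
        for _ in range(n_states):
            weights = [rng.random() for _ in range(n_states)]
            total = sum(weights)
            rows.append([w / total for w in weights])
        P.append(rows)
    R = [[rng.random() for _ in range(n_states)] for _ in range(n_actions)]
    return P, R


if __name__ == "__main__":
    P, R = random_mdp(n_states=4, n_actions=3)
    _, optimal_value = backward_induction(P, R, horizon=12)
    concatenated = temporal_concatenation(P, R, horizon=12, n_segments=3)
    achieved = evaluate(P, R, concatenated)
    # Regret per start state: optimal value minus value of the concatenated policy.
    print([v - w for v, w in zip(optimal_value, achieved)])
```

In this sketch the regret is never negative, because the full-horizon solution is optimal, and it is zero when `n_segments` is 1. The speed-up would come from the sub-problems being smaller and independent of one another, so they could be solved in parallel.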


2017 ◽  
Vol 36 (2) ◽  
pp. 231-258 ◽  
Author(s):  
Shayegan Omidshafiei ◽  
Ali-Akbar Agha-Mohammadi ◽  
Christopher Amato ◽  
Shih-Yuan Liu ◽  
Jonathan P How ◽  
...  

This work focuses on solving general multi-robot planning problems in continuous spaces with partial observability given a high-level domain description. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for multi-robot coordination problems. However, representing and solving Dec-POMDPs is often intractable for large problems. This work extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) to take advantage of the high-level representations that are natural for multi-robot problems and to facilitate scalable solutions to large discrete and continuous problems. The Dec-POSMDP formulation uses task macro-actions created from lower-level local actions that allow for asynchronous decision-making by the robots, which is crucial in multi-robot domains. This transformation from Dec-POMDPs to Dec-POSMDPs with a finite set of automatically generated macro-actions allows the use of efficient discrete-space search algorithms to solve them. The paper presents algorithms for solving Dec-POSMDPs, which are more scalable than previous methods since they can incorporate closed-loop belief space macro-actions in planning. These macro-actions are automatically constructed to produce robust solutions. The proposed algorithms are then evaluated on a complex multi-robot package delivery problem under uncertainty, showing that the approach can naturally represent realistic problems and provide high-quality solutions for large-scale problems.


2007 ◽  
Vol 7 (1) ◽  
pp. 59-92 ◽  
Author(s):  
Hyeong Soo Chang ◽  
Michael C. Fu ◽  
Jiaqiao Hu ◽  
Steven I. Marcus

Author(s):  
Hyeong Soo Chang ◽  
Jiaqiao Hu ◽  
Michael C. Fu ◽  
Steven I. Marcus
