Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes

2020, Vol. 34 (06), pp. 10061-10068
Author(s):  
Maxime Bouton ◽  
Jana Tumova ◽  
Mykel J. Kochenderfer

Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP). By formulating a planning problem, we show how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy. We demonstrate that our method scales to large POMDP domains and provides strong bounds on the performance of the resulting policy.
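
To make the point-based approach concrete, here is a minimal sketch of a single point-based value iteration backup at one belief point, written in Python with NumPy. It illustrates point-based backups in general, not the authors' implementation: the arrays `T`, `O`, and `R` are hypothetical placeholders, and in the model-checking setting of the paper one would take `R` to encode reaching accepting states of a product POMDP so that backed-up values approximate satisfaction probabilities.

```python
import numpy as np

def pbvi_backup(belief, alpha_vectors, T, O, R, gamma=0.95):
    """Return the best backed-up alpha vector at a single belief point.

    T[a]: |S| x |S| transition matrix for action a (placeholder).
    O[a]: |S| x |Z| observation matrix P(z | s') for action a (placeholder).
    R:    |S| x |A| reward matrix (placeholder).
    """
    n_states, n_actions = R.shape
    best_vec, best_val = None, -np.inf
    for a in range(n_actions):
        vec = R[:, a].astype(float).copy()
        n_obs = O[a].shape[1]
        for z in range(n_obs):
            # Back-project every alpha vector through (action a, observation z)
            # and keep the one with the highest value at this belief.
            projections = [T[a] @ (O[a][:, z] * alpha) for alpha in alpha_vectors]
            vec += gamma * max(projections, key=lambda p: belief @ p)
        val = belief @ vec
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec
```

Repeating this backup over a finite set of sampled belief points, rather than over the whole belief simplex, is what lets point-based methods scale to large POMDP domains.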

Author(s):  
Mahsa Ghasemi ◽  
Ufuk Topcu

In conventional partially observable Markov decision processes, the observations that the agent receives originate from fixed, known distributions. However, in a variety of real-world scenarios, the agent plays an active role in its perception by selecting which observations to receive. We avoid the combinatorial expansion of the action space that would result from integrating planning and perception decisions by using a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, not only does the solver require fewer belief points to approximate the reachable subspace of the belief simplex, but it also requires less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
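
As a rough illustration of the greedy strategy described above, the sketch below scores each candidate observation source by the expected posterior entropy of the belief and picks sources greedily. The interface is a hypothetical simplification, not the paper's API: `obs_models[i]` is a per-source likelihood matrix, `k` is a selection budget, and each source is scored independently rather than conditioned on already-selected sources, which a full implementation would need to do.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def greedy_observation_selection(belief, obs_models, k):
    """Greedily pick k observation sources minimizing expected posterior
    entropy. obs_models[i]: |S| x |Z_i| matrix P(z | s) (assumed interface)."""
    chosen, remaining = [], list(range(len(obs_models)))
    for _ in range(k):
        best_i, best_h = None, np.inf
        for i in remaining:
            M = obs_models[i]
            h = 0.0
            for z in range(M.shape[1]):
                p_z = belief @ M[:, z]               # marginal P(z) under belief
                if p_z > 0:
                    posterior = belief * M[:, z] / p_z
                    h += p_z * entropy(posterior)    # expected posterior entropy
            if h < best_h:
                best_i, best_h = i, h
        chosen.append(best_i)
        remaining.remove(best_i)
    return chosen
```

Because the score is computed per source instead of per subset of sources, the loop runs in time linear in the number of sources per pick, which is what avoids the combinatorial blow-up of a joint perception action space.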


AI Magazine, 2012, Vol. 33 (4), pp. 82
Author(s):  
Prashant J. Doshi

Decision making is a key feature of autonomous systems. It involves choosing optimally among different lines of action in information contexts that range from perfect knowledge of all aspects of the decision problem to only partial knowledge about it. The physical context often includes other interacting autonomous systems, typically called agents. In this article, I focus on decision making in a multiagent context with partial information about the problem. Relevant research in this complex but realistic setting has converged around two complementary, general frameworks and has introduced myriad specializations along the way. I put the two frameworks, the decentralized partially observable Markov decision process (Dec-POMDP) and the interactive partially observable Markov decision process (I-POMDP), in context and review the foundational algorithms for them, while briefly discussing advances in their specializations. I conclude by examining the avenues that research on these frameworks is pursuing.


Author(s):  
Hanna Kurniawati

Planning under uncertainty is critical to robotics. The partially observable Markov decision process (POMDP) is a mathematical framework for such planning problems. POMDPs are powerful because they carefully quantify the nondeterministic effects of actions and the partial observability of states. For the same reason, however, they are notorious for their high computational complexity and were long deemed impractical for robotics. Over the past two decades, the development of sampling-based approximate solvers has led to tremendous advances in POMDP-solving capabilities. Although these solvers do not generate the optimal solution, they can compute good POMDP solutions that significantly improve the robustness of robotics systems within reasonable computational resources, thereby making POMDPs practical for many realistic robotics problems. This article presents a review of POMDPs, emphasizing computational issues that have hindered their practicality in robotics and ideas in sampling-based solvers that have alleviated such difficulties, together with lessons learned from applying POMDPs to physical robots.


2001, Vol. 14, pp. 29-51
Author(s):  
N. L. Zhang ◽  
W. Zhang

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs, but it typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: it enabled value iteration to converge after only a few iterations on all the test problems.
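
The abstract does not spell out the acceleration technique itself, so the sketch below shows only the baseline loop whose slow convergence motivates the paper: repeated backups with a Bellman-residual stopping test. It reuses the hypothetical `pbvi_backup` from the first sketch as a point-based stand-in for the exact value iteration the paper studies; all names and tolerances are illustrative.

```python
import numpy as np

def value_iteration(beliefs, T, O, R, gamma=0.95, tol=1e-4, max_iter=1000):
    """Iterate point-based backups over a fixed belief set until the
    Bellman residual drops below tol (or max_iter is reached)."""
    n_states = R.shape[0]
    alpha_vectors = [np.zeros(n_states)]
    for _ in range(max_iter):
        new_vectors = [pbvi_backup(b, alpha_vectors, T, O, R, gamma)
                       for b in beliefs]
        # Largest one-step change in value over all belief points; the
        # iteration is considered converged once this residual is small.
        residual = max(
            abs(b @ nv - max(b @ av for av in alpha_vectors))
            for b, nv in zip(beliefs, new_vectors)
        )
        alpha_vectors = new_vectors
        if residual < tol:
            break
    return alpha_vectors
```

With a discount factor close to 1, the residual shrinks by roughly a factor of gamma per iteration, which is why unaccelerated value iteration can need many iterations to reach a tight tolerance.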


Author(s):  
YAODONG NI ◽  
ZHI-QIANG LIU

Partially observable Markov decision processes (POMDPs) are powerful for planning under uncertainty. However, it is usually impractical to specify a POMDP's parameters exactly, for reasons such as limited data for learning the model and the inability of a fixed POMDP to capture dynamic situations. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter partially observable Markov decision processes (BPOMDPs). A modified value iteration is proposed as a basic strategy for tackling parameter imprecision in BPOMDPs. In addition, we design the UL-based value iteration algorithm, in which each value backup is based on two sets of vectors, called the U-set and L-set, and we propose four strategies for computing them. We theoretically analyze the algorithm's computational complexity and reward loss, and we show its effectiveness and robustness empirically.
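
The UL-based backup itself manipulates vector sets over beliefs, which is involved; as a much-simplified illustration of the underlying idea, the sketch below performs one upper/lower-bound backup for a bounded-parameter MDP (the fully observable special case), choosing the most and least favorable transition distributions within the given intervals. The interval-extremizing construction follows the standard order-based technique for bounded-parameter MDPs; the names and interfaces are assumptions, not the paper's API.

```python
import numpy as np

def extreme_expectation(V, p_lo, p_hi, maximize):
    """Pick a distribution within elementwise bounds [p_lo, p_hi] that
    maximizes (or minimizes) the expectation of V, greedily assigning
    the leftover probability mass to the best (or worst) states."""
    order = np.argsort(V)[::-1] if maximize else np.argsort(V)
    p = p_lo.astype(float).copy()
    slack = 1.0 - p.sum()                # mass still to distribute
    for s in order:
        add = min(p_hi[s] - p[s], slack)
        p[s] += add
        slack -= add
        if slack <= 1e-12:
            break
    return p @ V

def bounded_backup(V_hi, V_lo, T_lo, T_hi, R, gamma=0.95):
    """One synchronous backup of upper and lower value bounds when
    transition probabilities are only known to lie in
    [T_lo[a][s], T_hi[a][s]] (hypothetical interface)."""
    n_states, n_actions = R.shape
    new_hi, new_lo = np.empty(n_states), np.empty(n_states)
    for s in range(n_states):
        new_hi[s] = max(R[s, a] + gamma *
                        extreme_expectation(V_hi, T_lo[a][s], T_hi[a][s], True)
                        for a in range(n_actions))
        new_lo[s] = max(R[s, a] + gamma *
                        extreme_expectation(V_lo, T_lo[a][s], T_hi[a][s], False)
                        for a in range(n_actions))
    return new_hi, new_lo
```

The gap between the two bounds after convergence quantifies the reward loss attributable to parameter imprecision, which parallels the role of the U-set and L-set in the belief-space algorithm.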


2021
Author(s):  
Shirin Akbarinasaji

Background: Bug tracking systems receive many bug reports daily. Although the software quality team aims to identify and resolve these bugs, it is never able to fix all of the reported bugs before the release deadline, and postponing a fix can have consequences. Prioritization of bug reports helps the software manager decide which bugs to fix and which to postpone. Typically, bug reports are prioritized based on severity, priority, the time and effort required for fixing, customer pressure, and similar factors.

Aim: Previous studies have shown that these factors may not be appropriate for prioritization, so relying on them to automate bug prioritization might be misleading. In this dissertation, we aim to prioritize bug reports with respect to the consequence of not fixing them, in terms of their relative importance in the issue tracking system.

Method: To measure the relative importance of bugs, we propose constructing a dependency graph from the reported dependency-blocking information in the issue tracking system. Two metrics, depth and degree, measure each bug's relative importance. However, there is uncertainty in the dependency graph's structure because the dependency information is discovered manually and gradually; owing to this uncertainty, prioritizing bugs in descending order of depth and degree may be misleading. To handle the uncertainty, we propose a novel approach based on a partially observable Markov decision process (POMDP) solved with partially observable Monte Carlo planning (POMCP).

Result: To check the feasibility of the proposed approach, we analyzed seven years of data from an open source project, Firefox, and from a commercial project. We compared the proposed policy with the developer policy, a maximum policy, and a random policy.

Conclusion: The results suggest that software practitioners do not consider the relative importance of bugs in their current practice. The proposed framework can be combined with practitioners' expertise to prioritize bugs more effectively while taking their depth and degree into account. In practice, the POMDP framework with the POMCP planner can help practitioners sequentially select bugs so as to minimize the connectivity of the dependency graph.
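
As a rough sketch of the two graph metrics, the snippet below builds a blocking-dependency graph and ranks bugs by depth (length of the longest blocking chain a bug roots) and degree (number of bugs it directly blocks). The sample data and the exact metric definitions are illustrative guesses at the dissertation's setup, and the resulting ranking is exactly the naive ordering that the POMDP/POMCP machinery is meant to improve on under uncertainty.

```python
import networkx as nx

# Hypothetical blocking data: an edge (u, v) means bug u blocks bug v.
G = nx.DiGraph([(101, 102), (101, 103), (103, 104), (105, 104)])

def degree(bug):
    return G.out_degree(bug)          # bugs directly blocked by `bug`

def depth(bug):
    # Longest blocking chain rooted at `bug`; assumes the graph is acyclic.
    succ = list(G.successors(bug))
    return 0 if not succ else 1 + max(depth(v) for v in succ)

# Naive prioritization: descending (depth, degree).
ranking = sorted(G.nodes, key=lambda b: (depth(b), degree(b)), reverse=True)
print(ranking)                        # [101, 103, 105, 102, 104]
```

Because blocking links are reported manually and trickle in over time, both metrics are computed on an incomplete graph; treating the true graph as a partially observed state is what motivates the POMDP formulation.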

