Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes

Journal of Artificial Intelligence Research ◽

10.1613/jair.761 ◽

2001 ◽

Vol 14 ◽

pp. 29-51 ◽

Cited By ~ 39

Author(s):

N. L. Zhang ◽

W. Zhang

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Benchmark Problems ◽

Test Problems ◽

Value Iteration ◽

Planning Under Uncertainty ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable ◽

Number Of Iterations

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.
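
For readers unfamiliar with the terminology in this abstract, the following is a minimal sketch of two building blocks that POMDP value iteration relies on: the Bayesian belief update and a single Bellman backup at one belief point over a set of alpha vectors. It is not the acceleration method proposed in the paper. All names (`belief_update`, `backup`, and the array layouts `T[a][s][s2]`, `Z[a][s2][o]`, `R[a][s]`) are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def belief_update(b, a, o, T, Z):
    """Bayes update: b'(s2) is proportional to Z[a][s2][o] * sum_s T[a][s][s2] * b[s]."""
    n = len(b)
    unnorm = [
        Z[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in unnorm]


def value(b, alphas):
    """Piecewise-linear value function: maximum over alpha vectors of b . alpha."""
    return max(dot(b, alpha) for alpha in alphas)


def backup(b, alphas, T, Z, R, gamma):
    """One Bellman backup at belief b; returns the best new alpha vector."""
    n = len(b)
    best_val, best_vec = None, None
    for a in range(len(T)):
        vec = list(R[a])  # immediate reward R[a][s] (assumed layout)
        for o in range(len(Z[a][0])):
            # Project every alpha vector back through action a and observation o.
            projected = [
                [
                    sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2] for s2 in range(n))
                    for s in range(n)
                ]
                for alpha in alphas
            ]
            # Keep the projection that is best at this particular belief.
            g = max(projected, key=lambda v: dot(b, v))
            vec = [vec[s] + gamma * g[s] for s in range(n)]
        val = dot(b, vec)
        if best_val is None or val > best_val:
            best_val, best_vec = val, vec
    return best_vec


# Hypothetical two-state, one-action ("listen") model used only as an example.
T = [[[1.0, 0.0], [0.0, 1.0]]]        # state never changes
Z = [[[0.85, 0.15], [0.15, 0.85]]]    # observation matches state 85% of the time
R = [[-1.0, -1.0]]                    # listening costs 1 in every state

b1 = belief_update([0.5, 0.5], 0, 0, T, Z)
alpha1 = backup([0.5, 0.5], [[0.0, 0.0]], T, Z, R, 0.95)
```

Repeating `backup` until the value function stops changing is the iteration whose slow convergence the paper addresses. Exact value iteration performs the backup over the whole belief simplex rather than at a single point, as done here.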

Bounded-Parameter Partially Observable Markov Decision Processes: Framework and Algorithm

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488513500396 ◽

2013 ◽

Vol 21 (06) ◽

pp. 821-863 ◽

Cited By ~ 2

Author(s):

Yaodong Ni ◽

Zhi-Qiang Liu

Keyword(s):

Markov Decision Processes ◽

Real Life ◽

Decision Processes ◽

Iteration Algorithm ◽

Value Iteration ◽

Planning Under Uncertainty ◽

Markov Decision ◽

Real Life Situation ◽

Partially Observable Markov ◽

Partially Observable

Partially observable Markov decision processes (POMDPs) are powerful for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model a real-life situation precisely, for reasons such as limited data for learning the model and the inability of exact POMDPs to model dynamic situations. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter partially observable Markov decision processes (BPOMDPs). A modified value iteration is proposed as a basic strategy for tackling parameter imprecision in BPOMDPs. In addition, we design the UL-based value iteration algorithm, in which each value backup is based on two sets of vectors called the U-set and the L-set. We propose four strategies for computing the U-set and the L-set. We theoretically analyze the computational complexity and the reward loss of the algorithm. The effectiveness and robustness of the algorithm are shown empirically.

Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/329 ◽

2019 ◽

Author(s):

Mahsa Ghasemi ◽

Ufuk Topcu

Keyword(s):

Markov Decision Processes ◽

Active Role ◽

Decision Processes ◽

Iteration Algorithm ◽

Value Iteration ◽

Greedy Strategy ◽

Markov Decision ◽

Partially Observable Markov ◽

Observation Selection ◽

Partially Observable

In conventional partially observable Markov decision processes, the observations that the agent receives originate from fixed, known distributions. However, in a variety of real-world scenarios, the agent has an active role in its perception by selecting which observations to receive. We avoid the combinatorial expansion of the action space that results from integrating planning and perception decisions by using a greedy strategy for observation selection that minimizes an information-theoretic measure of the state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, not only does the solver require fewer belief points to approximate the reachable subspace of the belief simplex, but it also requires less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and demonstrate its performance empirically.
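
The greedy observation selection described in this abstract can be illustrated with a minimal sketch. It assumes Shannon entropy of the posterior belief as the uncertainty measure and sensors that are conditionally independent given the state; neither assumption is confirmed by the abstract, and the names `expected_posterior_entropy` and `greedy_select` are hypothetical. It is not the authors' implementation.

```python
import itertools
import math


def entropy(b):
    """Shannon entropy (in nats) of a belief vector."""
    return -sum(p * math.log(p) for p in b if p > 0.0)


def expected_posterior_entropy(b, chosen):
    """Expected entropy of the posterior after reading every sensor in `chosen`.

    Each sensor is a matrix Zk[s][o] = P(o | s). Sensors are assumed to be
    conditionally independent given the state (an assumption of this sketch).
    """
    n = len(b)
    total = 0.0
    # Enumerate every joint observation of the chosen sensors.
    for obs in itertools.product(*(range(len(Zk[0])) for Zk in chosen)):
        unnorm = [
            b[s] * math.prod(Zk[s][o] for Zk, o in zip(chosen, obs))
            for s in range(n)
        ]
        p_obs = sum(unnorm)
        if p_obs > 0.0:
            total += p_obs * entropy([u / p_obs for u in unnorm])
    return total


def greedy_select(b, sensors, k):
    """Pick k sensor indices one at a time, each minimizing expected entropy."""
    chosen = []
    for _ in range(k):
        remaining = [i for i in range(len(sensors)) if i not in chosen]
        best = min(
            remaining,
            key=lambda i: expected_posterior_entropy(
                b, [sensors[j] for j in chosen + [i]]
            ),
        )
        chosen.append(best)
    return chosen


# Hypothetical two-state example: one uninformative and one informative sensor.
uninformative = [[0.5, 0.5], [0.5, 0.5]]
informative = [[0.9, 0.1], [0.1, 0.9]]
picked = greedy_select([0.5, 0.5], [uninformative, informative], 1)
```

Selecting sensors one at a time keeps the number of evaluated subsets linear in the number of sensors per pick, instead of enumerating every subset of size `k`. This is the kind of saving the abstract refers to when it mentions avoiding a combinatorial expansion of the action space.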

Region-based value iteration for partially observable Markov decision processes

Proceedings of the 23rd international conference on Machine learning - ICML '06 ◽

10.1145/1143844.1143915 ◽

2006 ◽

Cited By ~ 3

Author(s):

Hui Li ◽

Xuejun Liao ◽

Lawrence Carin

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Value Iteration ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable

Oracular Partially Observable Markov Decision Processes: A Very Special Case

Proceedings 2007 IEEE International Conference on Robotics and Automation ◽

10.1109/robot.2007.363691 ◽

2007 ◽

Cited By ~ 3

Author(s):

Nicholas Armstrong-Crews ◽

Manuela Veloso

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable ◽

Special Case

Active Chemical Sensing With Partially Observable Markov Decision Processes

10.1063/1.3156617 ◽

2009 ◽

Cited By ~ 2

Author(s):

Rakesh Gosangi ◽

Ricardo Gutierrez-Osuna ◽

Matteo Pardo ◽

Giorgio Sberveglieri

Keyword(s):

Markov Decision Processes ◽

Chemical Sensing ◽

Decision Processes ◽

Active Chemical ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable

Scalable grid‐based approximation algorithms for partially observable Markov decision processes

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.6743 ◽

2021 ◽

Author(s):

Can Kavaklioglu ◽

Mucahit Cevik

Keyword(s):

Approximation Algorithms ◽

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable ◽

Grid Based

Quasi-Deterministic Partially Observable Markov Decision Processes

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-642-10677-4_27 ◽

2009 ◽

pp. 237-246 ◽

Cited By ~ 2

Author(s):

Camille Besse ◽

Brahim Chaib-draa

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable

Partially Observable Markov Decision Processes

Universitext - Markov Decision Processes with Applications to Finance ◽

10.1007/978-3-642-18324-9_5 ◽

2011 ◽

pp. 147-174

Author(s):

Nicole Bäuerle ◽

Ulrich Rieder

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable

A Continuous Internal-State Controller for Partially Observable Markov Decision Processes

Artificial Neural Networks - ICANN 2008 - Lecture Notes in Computer Science ◽

10.1007/978-3-540-87536-9_41 ◽

2008 ◽

pp. 397-406

Author(s):

Yuki Taniguchi ◽

Takeshi Mori ◽

Shin Ishii

Keyword(s):

Markov Decision Processes ◽

Internal State ◽

Decision Processes ◽

State Controller ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable

Monotonicity properties for two-action partially observable Markov decision processes on partially ordered spaces

European Journal of Operational Research ◽

10.1016/j.ejor.2019.10.003 ◽

2020 ◽

Vol 282 (3) ◽

pp. 936-944

Author(s):

Erik Miehling ◽

Demosthenis Teneketzis

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Partially Ordered ◽

Monotonicity Properties ◽

Markov Decision ◽

Partially Observable Markov ◽

Partially Observable ◽

Partially Ordered Spaces
