A pause control approach to the value iteration scheme in average Markov decision processes

1998 ◽  
Vol 33 (4) ◽  
pp. 209-219 ◽  
Author(s):  
Rolando Cavazos-Cadena


2018 ◽
Vol 10 (2) ◽  
pp. 1-22
Author(s):  
Sanaa Chafik ◽  
Abdelhadi Larach ◽  
Cherki Daoui

The standard Value Iteration (VI) algorithm, referred to as the Value Iteration Pre-Jacobi (PJ-VI) algorithm, is the simplest Value Iteration scheme and the best-known algorithm for solving Markov Decision Processes (MDPs). Several versions of the VI algorithm have been developed in the literature to reduce the number of iterations: the VI Jacobi (VI-J) algorithm, the Value Iteration Pre-Gauss-Seidel (VI-PGS) algorithm, and the VI Gauss-Seidel (VI-GS) algorithm. In this article, the authors combine the advantages of the VI Pre-Gauss-Seidel algorithm, a decomposition technique, and parallelism to propose a new Parallel Hierarchical VI Pre-Gauss-Seidel algorithm. Experimental results show that their approach outperforms the traditional VI schemes when the global problem can be decomposed into smaller subproblems.
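These schemes differ in how much of the current sweep's freshly updated values each backup reuses. A minimal sketch contrasting the standard Pre-Jacobi sweep with an in-place Pre-Gauss-Seidel sweep (the toy MDP, `P`, `R`, and `gamma` below are illustrative, not taken from the paper):

```python
import numpy as np

# Toy MDP: n states, m actions; P[a][s, t] = Pr(t | s, a), R[a][s] rewards.
# All numbers here are illustrative only, not from the paper.
n, m, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.random((m, n, n)); P /= P.sum(axis=2, keepdims=True)
R = rng.random((m, n))

def vi_pre_jacobi(V, sweeps=100):
    # Pre-Jacobi VI: every state update uses values from the previous sweep.
    for _ in range(sweeps):
        V = np.max(R + gamma * np.einsum('ast,t->as', P, V), axis=0)
    return V

def vi_pre_gauss_seidel(V, sweeps=100):
    # Pre-Gauss-Seidel VI: each state immediately reuses values already
    # updated in the current sweep, which typically cuts iteration counts.
    V = V.copy()
    for _ in range(sweeps):
        for s in range(n):
            V[s] = max(R[a, s] + gamma * P[a, s] @ V for a in range(m))
    return V

print(vi_pre_jacobi(np.zeros(n)))
print(vi_pre_gauss_seidel(np.zeros(n)))
```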


2015 ◽  
Vol 13 (3) ◽  
pp. 47-57 ◽  
Author(s):  
Sanaa Chafik ◽  
Cherki Daoui

As many real applications involve a large number of states, classical methods are intractable for solving large Markov Decision Processes. A decomposition technique based on the topology of each state in the associated graph, combined with parallelization, is a useful way to cope with this problem. In this paper, the authors propose a Modified Value Iteration algorithm augmented with parallelism. They test their implementation on artificial data using OpenMP, which offers a significant speed-up.
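To illustrate the topology-based decomposition, one natural reading is to split the state graph into strongly connected components and solve the components level by level, components within a level being independent and thus parallelizable. A rough Python sketch under that reading (the adjacency matrix is made up; the paper's actual implementation uses OpenMP):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Hypothetical adjacency: edge s -> t if some action moves s to t with
# positive probability (values are illustrative, not from the paper).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]])

# Strongly connected components give the hierarchical pieces: states in
# one SCC depend only on their own SCC and on SCCs downstream of it, so
# each piece can be solved by a local VI; pieces with no unresolved
# dependencies can be solved in parallel. A full implementation would
# order the pieces via the condensation DAG (omitted here).
n_comp, labels = connected_components(csr_matrix(adj), connection='strong')
for c in range(n_comp):
    states = np.flatnonzero(labels == c)
    print(f"component {c}: states {states}")  # run local VI per component
```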


Author(s):  
Mahsa Ghasemi ◽  
Ufuk Topcu

In conventional partially observable Markov decision processes, the observations that the agent receives originate from fixed, known distributions. However, in a variety of real-world scenarios the agent plays an active role in its perception by selecting which observations to receive. We avoid the combinatorial expansion of the action space that would result from integrating planning and perception decisions by using a greedy strategy for observation selection that minimizes an information-theoretic measure of the state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, not only does the solver require fewer belief points to approximate the reachable subspace of the belief simplex, but it also requires less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
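As a concrete instance of such a greedy strategy, one can pick, at each belief point, the observation channel that minimizes the expected posterior entropy of the belief. A minimal sketch under that entropy-based reading (the `obs_models` layout and function names are assumptions, not the paper's API):

```python
import numpy as np

def entropy(b):
    # Shannon entropy of a belief vector (0 log 0 := 0).
    b = b[b > 0]
    return -np.sum(b * np.log(b))

def greedy_observation(belief, obs_models):
    # Pick the observation channel with the smallest expected posterior
    # entropy. obs_models[k][z, s] = Pr(z | s) for channel k; channels
    # and shapes here are illustrative, not from the paper.
    best_k, best_h = None, np.inf
    for k, O in enumerate(obs_models):
        pz = O @ belief                    # Pr(z) under the current belief
        h = 0.0
        for z in np.flatnonzero(pz > 0):
            post = O[z] * belief / pz[z]   # Bayes update for outcome z
            h += pz[z] * entropy(post)     # expected posterior entropy
        if h < best_h:
            best_k, best_h = k, h
    return best_k
```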


2020 ◽  
Vol 34 (04) ◽  
pp. 6778-6785
Author(s):  
Li Zhang ◽  
Xin Li ◽  
Sen Chen ◽  
Hongyu Zang ◽  
Jie Huang ◽  
...  

In this paper, we first formally define the problem set of spatially invariant Markov Decision Processes (MDPs) and show that Value Iteration Networks (VIN) and its extensions are computationally restricted to it due to their use of a shared convolution kernel. To generalize VIN to spatially variant MDPs, we propose Universal Value Iteration Networks (UVIN). In comparison with VIN, UVIN automatically learns a flexible but compact network structure to encode the transition dynamics of the problem and to support the differentiable planning module. We evaluate UVIN on both spatially invariant and spatially variant tasks, including navigation in regular mazes, chessboard mazes, and on Mars, as well as Minecraft item synthesis. Results show that UVIN achieves performance similar to VIN and its extensions on spatially invariant tasks and significantly outperforms other models on more general problems.
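To make the convolution-kernel point concrete: in a spatially invariant grid MDP, the Bellman backup can be written as a bank of shared convolutions, one kernel per action, which is the structure VIN exploits. A small numpy sketch under that assumption (grid size, kernels, and reward map are illustrative, not from the paper):

```python
import numpy as np
from scipy.ndimage import correlate

gamma = 0.95
reward = np.zeros((8, 8)); reward[7, 7] = 1.0   # goal in one corner
V = np.zeros((8, 8))

# One 3x3 kernel per action: each encodes where that action moves the
# agent (deterministic steps to one of the four neighbours).
actions = [np.zeros((3, 3)) for _ in range(4)]
for K, (di, dj) in zip(actions, [(-1, 0), (1, 0), (0, -1), (0, 1)]):
    K[1 + di, 1 + dj] = 1.0

for _ in range(60):
    # Q(s, a) = R(s) + gamma * sum_t P_a(t | s) V(t), then max over actions.
    # The same kernel is applied at every cell: spatial invariance.
    Q = [reward + gamma * correlate(V, K, mode='constant') for K in actions]
    V = np.max(Q, axis=0)

# In a spatially *variant* MDP the kernel would differ per cell, which a
# single shared convolution cannot express -- the gap UVIN addresses.
print(V.round(2))
```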

