Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-Sum Objectives

Author(s): Krishnendu Chatterjee, Adrián Elgyütt, Petr Novotný, Owen Rouillé

Partially observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard framework for modeling a wide range of problems related to decision making under uncertainty. Traditionally, the goal has been to obtain policies that optimize the expectation of the discounted-sum payoff. A key drawback of the expectation measure is that even low-probability events with extreme payoff can significantly affect the expectation, so the obtained policies are not necessarily risk averse. An alternative approach is to optimize the probability that the payoff is above a certain threshold, which yields risk-averse policies but ignores optimization of the expectation. We consider the expectation optimization with probabilistic guarantee (EOPG) problem, where the goal is to optimize the expectation while ensuring that the payoff is above a given threshold with at least a specified probability. We present several results on the EOPG problem, including the first algorithm to solve it.
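A minimal formalization of the EOPG objective may help fix ideas; the symbols below (the discounted-sum payoff Disc, the threshold t, and the probability bound alpha) are assumed notation, since the abstract fixes none:

% EOPG: maximize the expected discounted-sum payoff over policies that
% also give a probabilistic guarantee on the payoff itself.
% Disc(rho) = sum_{i >= 0} gamma^i r_i is the payoff of a run rho,
% t is the payoff threshold, and alpha the required probability.
\[
  \sup_{\sigma} \; \mathbb{E}^{\sigma}\!\big[\mathrm{Disc}\big]
  \quad \text{subject to} \quad
  \mathbb{P}^{\sigma}\!\big[\mathrm{Disc} \ge t\big] \;\ge\; \alpha,
\]
where \(\sigma\) ranges over observation-based policies of the POMDP. Pure expectation maximization is recovered as the special case \(\alpha = 0\), while pure threshold (risk-averse) optimization keeps only the constraint and ignores the objective.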

Author(s):  
Pascal Poupart

The goal of this chapter is to provide an introduction to Markov decision processes as a framework for sequential decision making under uncertainty. The aim is to give practitioners a basic understanding of the common modeling and solution techniques. Hence, we will not delve into the details of the most recent algorithms, but rather focus on the main concepts and the issues that impact deployment in practice. More precisely, we will review fully and partially observable Markov decision processes, describe basic algorithms for finding good policies, and discuss the modeling and computational issues that arise in practice.


2020, Vol. 34 (06), pp. 9794–9801
Author(s): Tomáš Brázdil, Krishnendu Chatterjee, Petr Novotný, Jiří Vahala

Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty. A classical optimization criterion for MDPs is to maximize the expected discounted-sum payoff, which ignores low-probability catastrophic events with a highly negative impact on the system. On the other hand, risk-averse policies require the probability of undesirable events to be below a given threshold, but they do not account for optimization of the expected payoff. We consider MDPs with discounted-sum payoff and failure states that represent catastrophic outcomes. The objective of risk-constrained planning is to maximize the expected discounted-sum payoff among risk-averse policies that ensure the probability of encountering a failure state is below a desired threshold. Our main contribution is an efficient risk-constrained planning algorithm that combines UCT-like search with a predictor learned through interaction with the MDP (in the style of AlphaZero) and with risk-constrained action selection via linear programming. We demonstrate the effectiveness of our approach with experiments on classical MDPs from the literature, including benchmarks with on the order of 10^6 states.
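The risk-constrained action selection can be illustrated with a small linear program; the sketch below is an assumed formulation (the per-action value and risk estimates would come from the search statistics and the learned predictor), not the paper's implementation:

# A minimal sketch of risk-constrained action selection via linear
# programming, as one could use at a decision node of a UCT-like search.
# The value and risk estimates and all names here are illustrative only.
import numpy as np
from scipy.optimize import linprog

def select_action_distribution(values, risks, risk_bound):
    """Return a distribution over actions that maximizes expected value
    while keeping the expected failure probability below risk_bound."""
    values = np.asarray(values, dtype=float)
    risks = np.asarray(risks, dtype=float)
    n = len(values)
    res = linprog(
        c=-values,                      # linprog minimizes, so negate values
        A_ub=risks.reshape(1, -1),      # expected risk <= risk_bound
        b_ub=[risk_bound],
        A_eq=np.ones((1, n)),           # probabilities sum to one
        b_eq=[1.0],
        bounds=[(0.0, 1.0)] * n,
    )
    return res.x if res.success else None

# Example: two actions, the riskier one has the higher value estimate.
dist = select_action_distribution(values=[1.0, 3.0],
                                  risks=[0.01, 0.30],
                                  risk_bound=0.10)

The solution of such a program is in general a randomized choice among actions, which is what lets the policy stay under the risk threshold while still exploiting high-payoff actions.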


Author(s): Ruiyang Song, Kuang Xu

We propose and analyze a temporal concatenation heuristic for solving large-scale finite-horizon Markov decision processes (MDPs), which divides the MDP into smaller sub-problems along the time horizon and generates an overall solution by simply concatenating the optimal solutions of these sub-problems. As a “black box” architecture, temporal concatenation works with a wide range of existing MDP algorithms. Our main results characterize the regret of temporal concatenation relative to the optimal solution. We provide upper bounds for general MDP instances, as well as a family of MDP instances for which the upper bounds are shown to be tight. Together, our results demonstrate temporal concatenation's potential for substantial speed-ups at the expense of some performance degradation.
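A minimal sketch of the temporal-concatenation idea on a tabular finite-horizon MDP, assuming backward induction as the per-segment solver and zero terminal values at segment boundaries (both are illustrative assumptions, not details taken from the paper):

# Temporal concatenation sketch: split the horizon into segments, solve
# each segment independently, and concatenate the per-segment policies.
import numpy as np

def backward_induction(P, R, horizon, terminal_value=None):
    """Solve one finite-horizon segment of a tabular MDP.
    P: (A, S, S) transition probabilities; R: (S, A) rewards.
    Returns the time-dependent policy (one action array per step)
    and the value function at the start of the segment."""
    A, S, _ = P.shape
    V = np.zeros(S) if terminal_value is None else terminal_value
    policy = []
    for _ in range(horizon):
        Q = R + (P @ V).T          # Q[s, a] = R[s, a] + sum_s' P[a, s, s'] V[s']
        policy.append(Q.argmax(axis=1))
        V = Q.max(axis=1)
    policy.reverse()               # computed backwards in time
    return policy, V

def temporal_concatenation(P, R, horizon, num_segments):
    """Solve each horizon segment with zero terminal value and
    concatenate the resulting policies into one horizon-long policy."""
    seg = horizon // num_segments
    lengths = [seg] * num_segments
    lengths[-1] += horizon - seg * num_segments
    full_policy = []
    for length in lengths:
        segment_policy, _ = backward_induction(P, R, length)
        full_policy.extend(segment_policy)
    return full_policy

Ignoring the value beyond each segment boundary is what makes the concatenated policy suboptimal relative to solving the full horizon at once; it is this gap that the regret bounds described in the abstract quantify.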

