stochastic shortest path
Recently Published Documents



Author(s): Petr Tomášek, Karel Horák, Aditya Aradhye, Branislav Bošanský, Krishnendu Chatterjee

We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem where one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached. The second player, who has perfect information, aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and make the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive a bound on the depth that guarantees arbitrary precision, (3) we propose a novel upper-bound estimation that allows early termination, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game.


Author(s): Aviv Rosenberg, Yishay Mansour

Stochastic shortest path (SSP) is a well-known problem in planning and control, in which an agent has to reach a goal state with minimum total expected cost. In this paper we present the adversarial SSP model, which also accounts for adversarial changes in the costs over time while the underlying transition function remains unchanged. Formally, an agent interacts with an SSP environment for K episodes, the cost function changes arbitrarily between episodes, and the transitions are unknown to the agent. We develop the first algorithms for adversarial SSPs and prove high-probability regret bounds that scale with the square root of K when all costs are strictly positive, and sub-linear regret in the general case. We are the first to consider this natural setting of adversarial SSP and to obtain sub-linear regret for it.
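
For orientation, the basic (non-adversarial) SSP objective described above can be illustrated with a minimal value-iteration sketch. This is not the algorithm from the paper: it assumes the costs and transitions are fixed and fully known, and the toy model `TOY_SSP`, the state and action names, and the function `value_iteration` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy SSP: state -> action -> list of (probability, next_state, cost).
# All costs are strictly positive and the goal is reachable from every state.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TOY_SSP: Transitions = {
    "s0": {
        "safe": [(1.0, "s1", 2.0)],
        "risky": [(0.5, "goal", 1.0), (0.5, "s0", 1.0)],
    },
    "s1": {
        "go": [(1.0, "goal", 2.0)],
    },
}
GOAL = "goal"


def q_value(outcomes: List[Tuple[float, str, float]], values: Dict[str, float]) -> float:
    """Expected immediate cost plus expected cost-to-go for one action."""
    return sum(p * (cost + values[nxt]) for p, nxt, cost in outcomes)


def value_iteration(
    ssp: Transitions, goal: str, tol: float = 1e-9, max_iters: int = 10_000
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Return the estimated minimum expected cost-to-go and a greedy policy."""
    values = {state: 0.0 for state in ssp}
    values[goal] = 0.0  # the goal is absorbing and cost-free
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in ssp.items():
            best = min(q_value(outcomes, values) for outcomes in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    # Extract the greedy policy with respect to the final value estimates.
    policy = {
        state: min(actions, key=lambda a: q_value(actions[a], values))
        for state, actions in ssp.items()
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(TOY_SSP, GOAL)
    print(values, policy)
```

In this toy model the "risky" action in `s0` retries until it reaches the goal, so its expected cost is lower than the deterministic two-step route through `s1`. The adversarial setting in the abstract is harder because the costs change between episodes and the transitions must be learned.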


Author(s): Peter Buchholz, Iryna Dohndorf

Stochastic shortest path problems (SSPPs) have many applications in practice and have been the subject of ongoing research for many years. This paper considers a variant of SSPPs where the times or costs to pass an edge in a graph are possibly correlated random variables. There are two general goals one can aim for: minimizing the expected cost to reach the destination, or maximizing the probability of reaching the destination within a given budget. Often one is interested in policies that strike a compromise between different goals, which results in multi-objective problems. In this paper, an algorithm is developed to compute the convex hull of Pareto optimal policies that consider expected costs and probabilities of falling below given budgets. The approach uses the recently published class of PH-graphs, which allow one to map SSPPs, even with generally distributed and correlated costs associated with edges, onto Markov decision processes (MDPs) and to apply the available techniques for MDPs to compute optimal policies.
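
The tension between the two goals can be seen in a small sketch that evaluates two fixed paths under both criteria. It is only an illustration under strong simplifying assumptions: edge costs are discrete and independent (the paper allows general, correlated distributions via PH-graphs, which are not used here), the policies are non-adaptive paths, and the names `PATH_A`, `PATH_B`, and all functions are hypothetical.

```python
from itertools import product
from typing import Dict, List

# Hypothetical discrete cost distribution of one edge: cost -> probability.
Dist = Dict[float, float]


def path_cost_distribution(edges: List[Dist]) -> Dist:
    """Distribution of the total path cost, assuming independent edge costs."""
    total: Dist = {0.0: 1.0}
    for edge in edges:
        combined: Dist = {}
        for (c1, p1), (c2, p2) in product(total.items(), edge.items()):
            combined[c1 + c2] = combined.get(c1 + c2, 0.0) + p1 * p2
        total = combined
    return total


def expected_cost(dist: Dist) -> float:
    """First objective: expected total cost to reach the destination."""
    return sum(cost * prob for cost, prob in dist.items())


def prob_within_budget(dist: Dist, budget: float) -> float:
    """Second objective: probability that the total cost stays within the budget."""
    return sum(prob for cost, prob in dist.items() if cost <= budget)


# Path A is deterministic; path B is cheaper on average but occasionally very costly.
PATH_A: List[Dist] = [{4.0: 1.0}, {4.0: 1.0}]
PATH_B: List[Dist] = [{1.0: 0.7, 15.0: 0.3}, {2.0: 1.0}]

if __name__ == "__main__":
    budget = 8.0
    for name, path in (("A", PATH_A), ("B", PATH_B)):
        dist = path_cost_distribution(path)
        print(name, expected_cost(dist), prob_within_budget(dist, budget))
```

Here path B has the lower expected cost while path A is more likely to stay within the budget of 8, so neither dominates the other and both would be Pareto optimal in this toy instance.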

