Markov decision processes with quasi-hyperbolic discounting

Author(s):  
Anna Jaśkiewicz ◽  
Andrzej S. Nowak

Abstract: We study Markov decision processes with Borel state spaces under quasi-hyperbolic discounting. This type of discounting nicely models human behaviour, which is time-inconsistent in the long run. The decision maker has preferences changing in time. Therefore, the standard approach based on the Bellman optimality principle fails. Within a dynamic game-theoretic framework, we prove the existence of randomised stationary Markov perfect equilibria for a large class of Markov decision processes with transitions having a density function. We also show that randomisation can be restricted to two actions in every state of the process. Moreover, we prove that under some conditions, this equilibrium can be replaced by a deterministic one. For models with countable state spaces, we establish the existence of deterministic Markov perfect equilibria. Many examples are given to illustrate our results, including a portfolio selection model with quasi-hyperbolic discounting.
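As a rough illustration of why preferences change over time under quasi-hyperbolic discounting, the standard form of the criterion can be sketched as follows. The symbols $\beta$, $\delta$, and $u_t$ are assumed notation for this sketch and are not taken from the abstract; the paper itself may use different symbols.

```latex
% Assumed notation: u_t is the one-period utility at time t,
% \delta \in (0,1) is the long-run discount factor,
% \beta \in (0,1] is the extra short-run discount factor.
\[
  U_t \;=\; u_t \;+\; \beta \sum_{k=1}^{\infty} \delta^{k}\, u_{t+k},
\]
% so the discount weights seen from time t are 1, \beta\delta, \beta\delta^2, \dots
% Seen from time t, the trade-off between periods t+1 and t+2 is
\[
  \frac{\beta\delta^{2}}{\beta\delta} \;=\; \delta ,
\]
% but once time t+1 arrives, the same trade-off becomes
\[
  \frac{\beta\delta}{1} \;=\; \beta\delta .
\]
```

Whenever $\beta < 1$ the two ratios differ, so a plan that looks optimal at time $t$ may no longer look optimal at time $t+1$. This is the time inconsistency that makes the Bellman optimality principle inapplicable.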

Author(s):  
Tomáš Brázdil ◽  
Václav Brožek ◽  
Krishnendu Chatterjee ◽  
Vojtěch Forejt ◽  
Antonín Kučera

1979 ◽  
Vol 36 (8) ◽  
pp. 939-947 ◽  
Author(s):  
Roy Mendelssohn

Conditions are given that imply there exist policies that "minimize risk" of undesirable events for stochastic harvesting models. It is shown that for many problems, either such a policy will not exist, or else it is an "extreme" policy that is equally undesirable. Techniques are given to systematically trade off decreases in the long-run expected return against decreases in the long-run risk. Several numerical examples are given for models of salmon runs, when both population-based risks and harvest-based risks are considered.
Key words: Markov decision processes, risk, salmon management, Pareto optimal policies, trade-off curves, linear programming


Automatica ◽  
2020 ◽  
Vol 111 ◽  
pp. 108582 ◽  
Author(s):  
Eugene A. Feinberg ◽  
Anna Jaśkiewicz ◽  
Andrzej S. Nowak

Author(s):  
Pranav Ashok ◽  
Krishnendu Chatterjee ◽  
Przemysław Daca ◽  
Jan Křetínský ◽  
Tobias Meggendorfer

1970 ◽  
Vol 7 (3) ◽  
pp. 649-656 ◽  
Author(s):  
Sheldon M. Ross

The semi-Markov decision model is considered under the criterion of long-run average cost. A new criterion is introduced which, for any policy, takes the limit of the expected cost incurred during the first n transitions divided by the expected length of the first n transitions. Conditions guaranteeing that an optimal stationary (non-randomized) policy exists are then presented. It is also shown that the new criterion is equivalent to the usual one under certain conditions.
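The two criteria named in this abstract can be written out as follows. This is only a sketch under assumed notation: $C_i$ (cost incurred during the $i$-th transition), $\tau_i$ (length of the $i$-th transition), $Z(t)$ (total cost incurred up to time $t$), $\pi$ (policy), and $x$ (initial state) are labels introduced here, not symbols from the abstract.

```latex
% Usual long-run average cost criterion (assumed notation):
\[
  \phi^{1}_{\pi}(x) \;=\; \lim_{t\to\infty} \frac{\mathbb{E}_{\pi}\!\left[\, Z(t) \mid X_0 = x \,\right]}{t}
\]
% Ratio criterion described in the abstract: expected cost over the
% first n transitions divided by their expected total length.
\[
  \phi^{2}_{\pi}(x) \;=\; \lim_{n\to\infty}
  \frac{\mathbb{E}_{\pi}\!\left[\, \sum_{i=1}^{n} C_i \mid X_0 = x \,\right]}
       {\mathbb{E}_{\pi}\!\left[\, \sum_{i=1}^{n} \tau_i \mid X_0 = x \,\right]}
\]
```

In general the limits need not exist, in which case a $\limsup$ or $\liminf$ would typically be used; the abstract does not say which convention the paper adopts.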

