Solving infinite horizon discounted Markov decision process problems for a range of discount factors

Title: Sequential Decision Making Using Quantiles

The goal of a traditional Markov decision process (MDP) is to maximize the expectation of cumulative reward over a finite or infinite horizon. In many applications, however, a decision maker may be interested in optimizing a specific quantile of the cumulative reward. For example, a physician may want to determine the optimal drug regime for a risk-averse patient with the objective of maximizing the 0.10 quantile of the cumulative reward; this is the cumulative improvement in health that is expected to occur with at least 90% probability for the patient. In “Quantile Markov Decision Processes,” X. Li, H. Zhong, and M. Brandeau provide analytic results to solve the quantile Markov decision process (QMDP) problem. They develop an efficient dynamic programming procedure that finds the optimal QMDP value function for all states and quantiles in one pass. The algorithm also extends to the MDP problem with a conditional value-at-risk objective.
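The difference between the expectation objective and the 0.10-quantile objective described above can be illustrated with a small sketch. This is not the dynamic programming procedure from the paper; it only compares two fixed, hypothetical regimens (`regimen_a`, `regimen_b`) whose per-step reward distributions and two-step horizon are invented for illustration, and it assumes rewards are independent across steps.

```python
from itertools import product


def cumulative_distribution(step_dist, horizon):
    """Exact distribution of the sum of `horizon` i.i.d. per-step rewards."""
    dist = {0.0: 1.0}
    for _ in range(horizon):
        new = {}
        for (total, p), (r, q) in product(dist.items(), step_dist.items()):
            new[total + r] = new.get(total + r, 0.0) + p * q
        dist = new
    return dist


def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def quantile(dist, tau):
    """Smallest x with P(X <= x) >= tau (small tolerance for float rounding)."""
    cum = 0.0
    for x in sorted(dist):
        cum += dist[x]
        if cum >= tau - 1e-12:
            return x
    return max(dist)


# Hypothetical per-step reward distributions (assumptions, not from the paper).
regimen_a = {0.0: 0.2, 5.0: 0.8}  # higher average reward, but can fail
regimen_b = {3.0: 1.0}            # lower average reward, no uncertainty

horizon = 2
dist_a = cumulative_distribution(regimen_a, horizon)
dist_b = cumulative_distribution(regimen_b, horizon)

for name, dist in (("A", dist_a), ("B", dist_b)):
    print(name, "mean:", mean(dist), "0.10 quantile:", quantile(dist, 0.10))
```

Under these made-up numbers, regimen A has the larger expected cumulative reward, while regimen B has the larger 0.10 quantile, so a risk-neutral and a quantile-maximizing decision maker would choose differently.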


Correction to: Solving an Infinite-Horizon Discounted Markov Decision Process by DC Programming and DCA

Advanced Computational Methods for Knowledge Engineering - Advances in Intelligent Systems and Computing, 2019, pp. C1-C1. DOI: 10.1007/978-3-319-38884-7_21

Author(s): Vinh Thanh Ho, Hoai An Le Thi

Keyword(s): Markov Decision Process, Decision Process, Infinite Horizon, DC Programming, Markov Decision


Infinite-Horizon Policy-Gradient Estimation with Variable Discount Factor for Markov Decision Process

2008 3rd International Conference on Innovative Computing Information and Control, 2008. DOI: 10.1109/icicic.2008.318

Cited By ~ 1

Author(s): Bing-Kun Bao, Bao-Qun Yin, Hong-Sheng Xi

Keyword(s): Markov Decision Process, Decision Process, Infinite Horizon, Discount Factor, Gradient Estimation, Policy Gradient, Markov Decision


Infinite-horizon average-cost Markov decision process routing games

2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), 2017. DOI: 10.1109/itsc.2017.8317849

Cited By ~ 2

Author(s): Dan Calderone, S. Shankar

Keyword(s): Markov Decision Process, Decision Process, Average Cost, Infinite Horizon, Markov Decision, Routing Games


A Markov Decision Process Approach for Cost-Benefit Analysis of Infrastructure Resilience Upgrades

SSRN Electronic Journal, 2020. DOI: 10.2139/ssrn.3657479

Author(s): Qianru Zhu, Benjamin D. Leibowicz

Keyword(s): Markov Decision Process, Decision Process, Cost Benefit Analysis, Cost Benefit, Process Approach, Benefit Analysis, Markov Decision, Infrastructure Resilience


A Markov Decision Process Workflow for Automating Interior Design

KSCE Journal of Civil Engineering, 2021. DOI: 10.1007/s12205-021-1272-6

Author(s): Ebrahim Karan, Sadegh Asgari, Abbas Rashidi

Keyword(s): Markov Decision Process, Interior Design, Decision Process, Markov Decision


A constraint partially observable semi-Markov decision process for the attack–defence relationships in various critical infrastructures

Cyber-Physical Systems, 2021, pp. 1-26. DOI: 10.1080/23335777.2021.1879935

Author(s): Nadia Niknami, Jie Wu

Keyword(s): Markov Decision Process, Decision Process, Critical Infrastructures, Markov Decision, Partially Observable


Development of a Shipment Policy for Collection Centers

Mathematics, 2021, Vol 9 (12), pp. 1385. DOI: 10.3390/math9121385

Author(s): Irais Mora-Ochomogo, Marco Serrato, Jaime Mora-Vargas, Raha Akhavan-Tabatabaei

Keyword(s): Climate Change, Natural Disasters, Markov Decision Process, Decision Process, Necessary Conditions, Decision Makers, Humanitarian Organizations, The World, Markov Decision, Unsatisfied Demand

Natural disasters represent a latent threat for every country in the world. Due to climate change and other factors, statistics show that they continue to be on the rise. This situation challenges communities and humanitarian organizations to be better prepared for natural disasters and to react to them faster. In some countries, in-kind donations represent a high percentage of the supply for the operations, which presents additional challenges. This research proposes a Markov Decision Process (MDP) model to represent operations in collection centers, where in-kind donations are received, sorted, packed, and sent to the affected areas. The decision addressed is when to send a shipment, considering the uncertainty of the donations’ supply and the demand, as well as the logistics costs and the penalty of unsatisfied demand. As a result of the MDP, a Monotone Optimal Non-Decreasing Policy (MONDP) is proposed, which provides valuable insights for decision-makers within this field. Moreover, the necessary conditions to prove the existence of such an MONDP are presented.
