First Passage Optimality for Continuous-Time Markov Decision Processes With Varying Discount Factors and History-Dependent Policies

This paper deals with the first passage optimality and variance minimisation problems of discrete-time Markov decision processes (MDPs) with varying discount factors and unbounded rewards/costs. First, under suitable conditions slightly weaker than those in the previous literature on the standard (infinite horizon) discounted MDPs, we establish the existence and characterisation of the first passage expected-optimal stationary policies. Second, to further distinguish the expected-optimal stationary policies, we introduce the variance minimisation problem, prove that it is equivalent to a new first passage optimality problem of MDPs, and, thus, show the existence of a variance-optimal policy that minimises the variance over the set of all first passage expected-optimal stationary policies. Finally, we use a computable example to illustrate our main results and also to show the difference between the first passage optimality here and the standard discount optimality of MDPs in the previous literature.
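
As a rough illustration of the first passage criterion described in this abstract, the sketch below runs value iteration on a tiny finite MDP in which the expected discounted cost is accumulated only until a target set is first reached. It is not the paper's method or example: the state-dependent discount factor `alpha`, the function name `first_passage_value_iteration`, and the three-state model are all assumptions made for illustration, and the paper's setting (unbounded costs, more general state spaces) is considerably broader.

```python
import numpy as np

def first_passage_value_iteration(P, c, alpha, target, tol=1e-10, max_iter=10_000):
    """Value iteration for a first passage discounted-cost problem (illustrative).

    P      : array (A, S, S), P[a, i, j] = probability of moving i -> j under action a
    c      : array (S, A), one-step cost c[i, a]
    alpha  : array (S,), assumed state-dependent discount factor in [0, 1)
    target : iterable of target states; costs stop accumulating once they are hit
    """
    n_states = P.shape[1]
    in_target = np.zeros(n_states, dtype=bool)
    in_target[list(target)] = True

    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Q[i, a] = c(i, a) + alpha(i) * sum_j P(j | i, a) * V(j)
        Q = c + alpha[:, None] * np.einsum("aij,j->ia", P, V)
        V_new = Q.min(axis=1)
        V_new[in_target] = 0.0  # no further cost after first passage into the target
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy stationary policy with respect to the final value estimate
    Q = c + alpha[:, None] * np.einsum("aij,j->ia", P, V)
    return V, Q.argmin(axis=1)

# Hypothetical model: states 0 and 1 are transient, state 2 is the target.
P = np.array([
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # action 0
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # action 1
])
c = np.array([[1.0, 5.0], [1.0, 1.0], [0.0, 0.0]])
alpha = np.array([0.9, 0.5, 0.9])

V, policy = first_passage_value_iteration(P, c, alpha, target={2})
print(V, policy)  # V[0] = 1.9, V[1] = 1.0, V[2] = 0.0; action 0 in states 0 and 1
```

Forcing the value to zero on the target set is what separates this from standard infinite-horizon discounting, where costs would keep accumulating after the target is reached.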

First Passage Optimality and Variance Minimisation of Markov Decision Processes with Varying Discount Factors

Journal of Applied Probability, 2015, Vol 52 (02), pp. 441-456
DOI: 10.1017/s0021900200012560
Cited By ~ 1

Author(s): Xiao Wu, Xianping Guo

Keyword(s): Markov Decision Processes, Discrete Time, Infinite Horizon, Decision Processes, Previous Literature, First Passage, Discount Factors, Markov Decision, The Difference, Minimisation Problem

On the First Passage $g$-Mean-Variance Optimality for Discounted Continuous-Time Markov Decision Processes

SIAM Journal on Control and Optimization, 2015, Vol 53 (3), pp. 1406-1424
DOI: 10.1137/140968872
Cited By ~ 6

Author(s): Xianping Guo, Xiangxiang Huang, Yi Zhang

Keyword(s): Markov Decision Processes, Continuous Time, Decision Processes, First Passage, Markov Decision, Mean Variance

Approximate Optimal Cost and Policies of First Passage Markov Decision Processes with Countable-State Space and Discount Factors

Proceedings of IncoME-V & CEPE Net-2020 - Mechanisms and Machine Science, 2021, pp. 39-49
DOI: 10.1007/978-3-030-75793-9_5

Author(s): Xiao Wu, Yanqiu Tang

Keyword(s): State Space, Markov Decision Processes, Decision Processes, First Passage, Countable State Space, Optimal Cost, Discount Factors, Countable State, Markov Decision

Denumerable state continuous time Markov decision processes with unbounded cost and transition rates under average criterion

The ANZIAM Journal, 2002, Vol 43 (4), pp. 541-557
DOI: 10.1017/s144618110001213x
Cited By ~ 10

Author(s): Xianping Guo, Weiping Zhu

Keyword(s): Markov Decision Processes, Continuous Time, Decision Processes, Transition Rates, Birth And Death Processes, Optimality Equation, Average Criterion, Markov Decision, Unbounded Cost, Queue Model

In this paper, we consider denumerable-state continuous-time Markov decision processes with (possibly unbounded) transition and cost rates under the average criterion. We present a set of conditions under which we prove the existence of both average-cost optimal stationary policies and a solution of the average optimality equation. The results are applied to an admission control queue model and to controlled birth and death processes.
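
For orientation, the average optimality equation mentioned in this abstract is commonly written in the continuous-time MDP literature in the form below. The notation is an assumption and is not taken from the paper: $S$ is the denumerable state space, $A(i)$ the set of actions available in state $i$, $q(j \mid i, a)$ the transition rates, $c(i, a)$ the cost rate, $g^*$ the optimal average cost, and $h$ a relative value (bias) function.

```latex
\[
  g^* \;=\; \inf_{a \in A(i)} \Big\{ c(i, a) + \sum_{j \in S} q(j \mid i, a)\, h(j) \Big\},
  \qquad i \in S .
\]
```

A stationary policy that attains the infimum in every state is then average-cost optimal, provided conditions such as those the paper establishes hold.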
