The Linear Programming Approach to Reach-Avoid Problems for Markov Decision Processes

2017 ◽  
Vol 60 ◽  
pp. 263-285 ◽  
Author(s):  
Nikolaos Kariotoglou ◽  
Maryam Kamgarpour ◽  
Tyler H. Summers ◽  
John Lygeros

One of the most fundamental problems in Markov decision processes is analysis and control synthesis for safety and reachability specifications. We consider the stochastic reach-avoid problem, in which the objective is to synthesize a control policy to maximize the probability of reaching a target set at a given time, while staying in a safe set at all prior times. We characterize the solution to this problem through an infinite dimensional linear program. We then develop a tractable approximation to the infinite dimensional linear program through finite dimensional approximations of the decision space and constraints. For a large class of Markov decision processes modeled by Gaussian mixture kernels we show that, through a proper selection of the finite dimensional space, one can further reduce the computational complexity of the resulting linear program. We validate the proposed method and analyze its potential with numerical case studies.
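The reach-avoid value function described in this abstract satisfies a backward recursion, which the infinite dimensional linear program characterizes. A minimal sketch of that recursion on a hypothetical five-state chain (the transition kernel, safe and target sets, and horizon are all illustrative assumptions, not taken from the paper):

```python
import numpy as np

n_states, n_actions, horizon = 5, 2, 10
target = {4}          # target set K'
safe = {1, 2, 3, 4}   # safe set K; state 0 is the "avoid" region

# P[a, x, y]: assumed toy random-walk kernel; action 1 pushes toward
# the target more reliably than action 0.
P = np.zeros((n_actions, n_states, n_states))
p_up = [0.4, 0.8]
for a in range(n_actions):
    for x in range(n_states):
        up, down = min(x + 1, n_states - 1), max(x - 1, 0)
        P[a, x, up] += p_up[a]
        P[a, x, down] += 1 - p_up[a]

# Backward recursion for the reach-avoid probability:
# V(x) = 1 on the target, 0 outside the safe set,
# and max_a sum_y P(y|x,a) V(y) elsewhere.
V = np.array([1.0 if x in target else 0.0 for x in range(n_states)])
for _ in range(horizon):
    Q = P @ V              # Q[a, x] = sum_y P(y|x,a) V(y)
    V_new = Q.max(axis=0)  # maximize over actions
    for x in range(n_states):
        if x in target:
            V_new[x] = 1.0
        elif x not in safe:
            V_new[x] = 0.0
    V = V_new
```

Each entry of `V` is then the maximal probability of reaching the target within the horizon while remaining safe, started from that state.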

2013 ◽  
Vol 45 (3) ◽  
pp. 837-859 ◽  
Author(s):  
François Dufour ◽  
A. B. Piunovskiy

In this work, we study discrete-time Markov decision processes (MDPs) with constraints when all the objectives have the same form of expected total cost over the infinite time horizon. Our objective is to analyze this problem by using the linear programming approach. Under some technical hypotheses, it is shown that if there exists an optimal solution for the associated linear program then there exists a randomized stationary policy which is optimal for the MDP, and that the optimal value of the linear program coincides with the optimal value of the constrained control problem. A second important result states that the set of randomized stationary policies provides a sufficient set for solving this MDP. It is important to note that, in contrast with the classical results of the literature, we do not assume the MDP to be transient or absorbing. More importantly, we do not require the cost functions to be nonnegative or bounded below. Several examples are presented to illustrate our results.
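The linear programming approach mentioned here optimizes over occupation measures. A minimal sketch for a hypothetical two-state, two-action total-cost MDP with an absorbing cost-free state (the toy model is transient for simplicity, even though the paper's point is precisely that transience is not required; all numbers are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical MDP: transient states {0, 1}, absorbing cost-free state 2.
# P[a, x, y] = transition probability; c[x, a] = one-step cost.
P = np.zeros((2, 2, 3))
P[0, 0, 1] = 1.0   # action 0 in state 0 -> state 1
P[1, 0, 2] = 1.0   # action 1 in state 0 -> absorbing
P[0, 1, 2] = 1.0   # action 0 in state 1 -> absorbing
P[1, 1, 2] = 1.0   # action 1 in state 1 -> absorbing
c = np.array([[1.0, 3.0],    # costs at state 0
              [1.0, 2.5]])   # costs at state 1
nu = np.array([1.0, 0.0])    # initial distribution over transient states

# Variables mu(x, a) >= 0, flattened as [mu(0,0), mu(0,1), mu(1,0), mu(1,1)].
# Balance: sum_a mu(y,a) - sum_{x,a} P(y|x,a) mu(x,a) = nu(y), y transient.
A_eq = np.zeros((2, 4))
for y in range(2):
    for x in range(2):
        for a in range(2):
            j = 2 * x + a
            if x == y:
                A_eq[y, j] += 1.0
            A_eq[y, j] -= P[a, x, y]

res = linprog(c=c.ravel(), A_eq=A_eq, b_eq=nu, bounds=(0, None))
mu = res.x.reshape(2, 2)
print(res.fun)   # optimal expected total cost of the LP
```

In this toy instance the LP selects action 0 in both states (pass through state 1, then absorb), and its optimal value coincides with the optimal expected total cost of the control problem, as in the result stated above.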


2005 ◽  
Vol 02 (03) ◽  
pp. 251-258
Author(s):  
HANLIN HE ◽  
QIAN WANG ◽  
XIAOXIN LIAO

Using a result from Fenchel duality, we derive the dual formulation of the maximal-minimal problem for an objective function of the error response to a fixed input in continuous-time systems. This formulation transforms the original problem, posed in an infinite-dimensional space, into a maximization problem with constraints in a finite-dimensional space, which can be studied with finite-dimensional theory. When the objective function is the norm, the maximum, or the minimum of the error response, dual formulations are obtained for the problems of L1-optimal control, minimization of the maximal error response, and minimal overshoot, among others, which gives a method for studying these problems.
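As a hedged sketch of the duality presumably at work (the paper's exact pairing and spaces are not reproduced here), the Fenchel duality theorem states that for suitable convex functions $f$ on $X$ and $g$ on $Y$ and a bounded linear map $A \colon X \to Y$,

```latex
\inf_{x \in X} \bigl\{ f(x) + g(Ax) \bigr\}
  \;=\; \sup_{y^* \in Y^*} \bigl\{ -f^*(A^* y^*) - g^*(-y^*) \bigr\},
```

where $f^*$, $g^*$ are the convex conjugates and $A^*$ is the adjoint of $A$. When the dual variable $y^*$ ranges over a finite-dimensional space, the right-hand side is a constrained finite-dimensional maximization, which matches the reduction the abstract describes.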


2012 ◽  
Vol 44 (3) ◽  
pp. 774-793 ◽  
Author(s):  
François Dufour ◽  
M. Horiguchi ◽  
A. B. Piunovskiy

This paper deals with discrete-time Markov decision processes (MDPs) under constraints where all the objectives have the same form of expected total cost over the infinite time horizon. The existence of an optimal control policy is discussed by using the convex analytic approach. We work under the assumptions that the state and action spaces are general Borel spaces, and that the model is nonnegative, semicontinuous, and there exists an admissible solution with finite cost for the associated linear program. It is worth noting that, in contrast to the classical results in the literature, our hypotheses do not require the MDP to be transient or absorbing. Our first result ensures the existence of an optimal solution to the linear program given by an occupation measure of the process generated by a randomized stationary policy. Moreover, it is shown that this randomized stationary policy provides an optimal solution to this Markov control problem. As a consequence, these results imply that the set of randomized stationary policies is a sufficient set for this optimal control problem. Finally, our last main result states that all optimal solutions of the linear program coincide on a special set with an optimal occupation measure generated by a randomized stationary policy. Several examples are presented to illustrate some theoretical issues and the possible applications of the results developed in the paper.
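The passage from an optimal occupation measure to the randomized stationary policy it generates, mentioned in this abstract, amounts to a row normalization: $\pi(a \mid x) = \mu(x,a) / \sum_{a'} \mu(x,a')$ wherever the state mass is positive. A minimal illustration with an assumed measure (the numbers are not from the paper):

```python
import numpy as np

# Hypothetical optimal occupation measure mu(x, a) on 2 states x 2 actions.
mu = np.array([[0.8, 0.2],
               [0.0, 1.5]])

# Randomized stationary policy: pi(a | x) = mu(x, a) / sum_a' mu(x, a'),
# with an arbitrary (here: uniform) choice at states carrying zero mass.
mass = mu.sum(axis=1, keepdims=True)
pi = np.where(mass > 0, mu / np.where(mass > 0, mass, 1.0), 1.0 / mu.shape[1])
print(pi)
```

Each row of `pi` is a probability distribution over actions, i.e. a randomized stationary policy in the sense of the sufficiency result stated above.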


2018 ◽  
Vol 63 (4) ◽  
pp. 1185-1191 ◽  
Author(s):  
Chandrashekar Lakshminarayanan ◽  
Shalabh Bhatnagar ◽  
Csaba Szepesvari

2017 ◽  
Vol 20 (10) ◽  
pp. 74-83
Author(s):  
V.L. Pasikov

For a conflict-controlled differential system with delay, the study of the dynamic convergence-evasion game with respect to a functional target set is continued, now concerning evasion and the existence of an alternative in the case under consideration. In this work, the saddle-point condition on the right-hand side of the controlled system is not assumed. Similar problems were earlier posed and solved in finite-dimensional spaces by the scientific school of Academician N. N. Krasovsky; for the infinite-dimensional space of continuous functions, similar problems were considered by the author. In the present work, the norm of a Hilbert space is used in proving the convergence-evasion theorem.

