markov decision processes Latest Research Papers

AbstractReinforcement Learning (RL) has emerged as an effective approach to address a variety of complex control tasks. In a typical RL problem, an agent interacts with the environment by perceiving observations and performing actions, with the ultimate goal of maximizing the cumulative reward. In the traditional formulation, the environment is assumed to be a fixed entity that cannot be externally controlled. However, there exist several real-world scenarios in which the environment offers the opportunity to configure some of its parameters, with diverse effects on the agent’s learning process. In this contribution, we provide an overview of the main aspects of environment configurability. We start by introducing the formalism of the Configurable Markov Decision Processes (Conf-MDPs) and we illustrate the solutions concepts. Then, we revise the algorithms for solving the learning problem in Conf-MDPs. Finally, we present two applications of Conf-MDPs: policy space identification and control frequency adaptation.

Opportunistic Multi-robot Environmental Sampling via Decentralized Markov Decision Processes

Distributed Autonomous Robotic Systems - Springer Proceedings in Advanced Robotics ◽

10.1007/978-3-030-92790-5_13 ◽

2022 ◽

pp. 163-175

Author(s):

Ayan Dutta ◽

O. Patrick Kreidl ◽

Jason M. O’Kane

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Environmental Sampling ◽

Markov Decision ◽

Multi Robot

Optimal inspection and maintenance planning for deteriorating structural components through dynamic Bayesian networks and Markov decision processes

Structural Safety ◽

10.1016/j.strusafe.2021.102140 ◽

2022 ◽

Vol 94 ◽

pp. 102140

Author(s):

P.G. Morato ◽

K.G. Papakonstantinou ◽

C.P. Andriotis ◽

J.S. Nielsen ◽

P. Rigo

Keyword(s):

Bayesian Networks ◽

Markov Decision Processes ◽

Dynamic Bayesian Networks ◽

Decision Processes ◽

Maintenance Planning ◽

Structural Components ◽

Markov Decision ◽

Inspection And Maintenance

Continuous-time zero-sum games for markov decision processes with discounted risk-sensitive cost criterion on a general state space

Stochastic Analysis and Applications ◽

10.1080/07362994.2021.2013889 ◽

2021 ◽

pp. 1-31

Author(s):

Subrata Golui ◽

Chandan Pal

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Decision Processes ◽

General State ◽

Zero Sum Games ◽

Cost Criterion ◽

Risk Sensitive ◽

Markov Decision ◽

General State Space ◽

Zero Sum

Solving constrained K-Markov decision processes

10.36334/modsim.2021.i2.ferrermestres ◽

2021 ◽

Keyword(s):

Markov Decision Processes ◽

Decision Processes ◽

Markov Decision

On Linear Programming for Constrained and Unconstrained Average-Cost Markov Decision Processes with Countable Action Spaces and Strictly Unbounded Costs

Mathematics of Operations Research ◽

10.1287/moor.2021.1177 ◽

2021 ◽

Author(s):

Huizhen Yu

Keyword(s):

Linear Programming ◽

Markov Decision Processes ◽

Average Cost ◽

Decision Processes ◽

The State ◽

Programming Approach ◽

One Stage ◽

Markov Decision ◽

Action Spaces ◽

Borel Measurable

We consider the linear programming approach for constrained and unconstrained Markov decision processes (MDPs) under the long-run average-cost criterion, where the class of MDPs in our study have Borel state spaces and discrete countable action spaces. Under a strict unboundedness condition on the one-stage costs and a recently introduced majorization condition on the state transition stochastic kernel, we study infinite-dimensional linear programs for the average-cost MDPs and prove the absence of a duality gap and other optimality results. Our results do not require a lower-semicontinuous MDP model. Thus, they can be applied to countable action space MDPs where the dynamics and one-stage costs are discontinuous in the state variable. Our proofs make use of the continuity property of Borel measurable functions asserted by Lusin’s theorem.