A Real-Time Computational Learning Model for Sequential Decision-Making Problems Under Uncertainty

Modeling dynamic systems incurring stochastic disturbances for deriving a control policy is a ubiquitous task in engineering. However, in some instances obtaining a model of a system may be impractical or impossible. Alternative approaches have been developed using a simulation-based stochastic framework, in which the system interacts with its environment in real time and obtains information that can be processed to produce an optimal control policy. In this context, the problem of developing a policy for controlling the system’s behavior is formulated as a sequential decision-making problem under uncertainty. This paper considers the problem of deriving, in real time, a control policy for a dynamic system with unknown dynamics. The evolution of the system is modeled as a controlled Markov chain. A new state-space representation model and a learning mechanism are proposed that can be used to improve system performance over time. The major difference between the existing methods and the proposed learning model is that the latter utilizes an evaluation function, which considers the expected cost that can be achieved by state transitions forward in time. The model allows decision-making based on gradually enhanced knowledge of system response as it transitions from one state to another, in conjunction with actions taken at each state. The proposed model is demonstrated on the single cart-pole balancing problem and a vehicle cruise-control problem.

A State-Space Representation Model and Learning Algorithm for Real-Time Decision-Making Under Uncertainty

Volume 9: Mechanical Systems and Control, Parts A, B, and C ◽

10.1115/imece2007-41258 ◽

2007 ◽

Cited By ~ 4

Author(s):

Andreas A. Malikopoulos ◽

Panos Y. Papalambros ◽

Dennis N. Assanis

Keyword(s):

Decision Making ◽

Optimal Control ◽

Real Time ◽

Learning Algorithm ◽

Control Policy ◽

Sequential Decision Making ◽

Space Representation ◽

Decision Making Under Uncertainty ◽

Sequential Decision ◽

State Space Representation

Modeling dynamic systems incurring stochastic disturbances for deriving a control policy is a ubiquitous task in engineering. However, in some instances obtaining a model of a system may be impractical or impossible. Alternative approaches have been developed using a simulation-based stochastic framework, in which the system interacts with its environment in real time and obtains information that can be processed to produce an optimal control policy. In this context, the problem of developing a policy for controlling the system’s behavior is formulated as a sequential decision-making problem under uncertainty. This paper considers real-time sequential decision-making under uncertainty modeled as a Markov Decision Process (MDP). A state-space representation model is constructed through a learning mechanism and is used to improve system performance over time. The model allows decision-making based on gradually enhanced knowledge of system response as it transitions from one state to another, in conjunction with actions taken at each state. A learning algorithm is implemented that realizes, in real time, the optimal control policy associated with the state transitions. The proposed method is demonstrated on the single cart-pole balancing problem and a vehicle cruise-control problem.
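
The idea of building knowledge of system response from observed state transitions can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `TransitionModel`, its method names, and the one-step greedy rule are all hypothetical and assumed here for illustration only. The sketch counts observed transitions of a controlled Markov chain, estimates transition probabilities and mean costs from those counts, and picks the action with the lowest mean observed cost.

```python
from collections import defaultdict


class TransitionModel:
    """Empirical model of a controlled Markov chain (illustrative, hypothetical)."""

    def __init__(self):
        # counts[(state, action)][next_state] = number of observed transitions
        self.counts = defaultdict(lambda: defaultdict(int))
        # cost_sum[(state, action)] = total cost observed for that pair
        self.cost_sum = defaultdict(float)

    def observe(self, state, action, cost, next_state):
        """Record one transition observed while the system runs."""
        self.counts[(state, action)][next_state] += 1
        self.cost_sum[(state, action)] += cost

    def transition_probs(self, state, action):
        """Relative frequencies of next states; empty dict if never tried."""
        seen = self.counts[(state, action)]
        total = sum(seen.values())
        return {s: n / total for s, n in seen.items()} if total else {}

    def mean_cost(self, state, action):
        """Average observed cost; infinite for pairs never tried (an assumption)."""
        total = sum(self.counts[(state, action)].values())
        return self.cost_sum[(state, action)] / total if total else float("inf")

    def greedy_action(self, state, actions):
        """Action with the lowest mean observed one-step cost."""
        return min(actions, key=lambda a: self.mean_cost(state, a))


model = TransitionModel()
model.observe("s0", "a", 1.0, "s1")
model.observe("s0", "a", 3.0, "s0")
model.observe("s0", "b", 1.0, "s1")
best = model.greedy_action("s0", ["a", "b"])
```

Because untried actions are given infinite cost in this sketch, it never explores on its own; a practical learner would need some exploration rule, and a look-ahead evaluation would replace the one-step cost.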

Imaginative Reinforcement Learning: Computational Principles and Neural Mechanisms

Journal of Cognitive Neuroscience ◽

10.1162/jocn_a_01170 ◽

2017 ◽

Vol 29 (12) ◽

pp. 2103-2113 ◽

Cited By ~ 8

Author(s):

Samuel J. Gershman ◽

Jimmy Zhou ◽

Cody Kommers

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Internal Model ◽

Learning Model ◽

Neural Mechanisms ◽

Sequential Decision Making ◽

Sequential Decision ◽

Reinforcement Learning Model

Imagination enables us not only to transcend reality but also to learn about it. In the context of reinforcement learning, an agent can rationally update its value estimates by simulating an internal model of the environment, provided that the model is accurate. In a series of sequential decision-making experiments, we investigated the impact of imaginative simulation on subsequent decisions. We found that imagination can cause people to pursue imagined paths, even when these paths are suboptimal. This bias is systematically related to participants' optimism about how much reward they expect to receive along imagined paths; providing feedback strongly attenuates the effect. The imagination effect can be captured by a reinforcement learning model that includes a bonus added onto imagined rewards. Using fMRI, we show that a network of regions associated with valuation is predictive of the imagination effect. These results suggest that imagination, although a powerful tool for learning, is also susceptible to motivational biases.
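
A reinforcement learning model with a bonus on imagined rewards can be sketched as a simple delta-rule value update. The function name `update_value`, the parameters `learning_rate` and `bonus`, and the delta-rule form itself are assumptions made for illustration; the abstract does not specify the exact model the authors fitted.

```python
def update_value(value, reward, learning_rate=0.1, imagined=False, bonus=0.0):
    """Delta-rule update of a value estimate (illustrative, hypothetical).

    When the outcome was imagined rather than experienced, a bonus is
    added to the reward before the update, which inflates the value of
    imagined paths.
    """
    effective_reward = reward + (bonus if imagined else 0.0)
    return value + learning_rate * (effective_reward - value)


# Same nominal reward, experienced versus imagined with a bonus.
experienced = update_value(0.0, 1.0, learning_rate=0.5)
imagined = update_value(0.0, 1.0, learning_rate=0.5, imagined=True, bonus=1.0)
```

With a positive bonus, the imagined update ends higher than the experienced one for the same nominal reward, which is one way a preference for imagined paths could arise.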

Sequential decision-making in healthcare IoT: Real-time health monitoring, treatments and interventions

2016 IEEE 3rd World Forum on Internet of Things (WF-IoT) ◽

10.1109/wf-iot.2016.7845446 ◽

2016 ◽

Cited By ~ 5

Author(s):

Daphney-Stavroula Zois

Keyword(s):

Decision Making ◽

Real Time ◽

Health Monitoring ◽

Sequential Decision Making ◽

Sequential Decision

Supplemental Material for Melioration as Rational Choice: Sequential Decision Making in Uncertain Environments

Psychological Review ◽

10.1037/a0030850.supp ◽

2012 ◽

Keyword(s):

Decision Making ◽

Rational Choice ◽

Sequential Decision Making ◽

Sequential Decision ◽

Uncertain Environments

The Contrasting Effects of Perceived Control: Implications for Sequential Decision-Making

PsycEXTRA Dataset ◽

10.1037/e504392014-030 ◽

2013 ◽

Author(s):

Maggie Y. Chu ◽

Robert S. Wyer ◽

Lisa C. Wan

Keyword(s):

Decision Making ◽

Perceived Control ◽

Sequential Decision Making ◽

Sequential Decision

Human and Optimal Valuation in a Sequential Decision-Making Task With Uncertainty

PsycEXTRA Dataset ◽

10.1037/e527342012-505 ◽

2007 ◽

Author(s):

Kyler M. Eastman ◽

Brian J. Stankiewicz ◽

Alex C. Huk

Keyword(s):

Decision Making ◽

Sequential Decision Making ◽

Sequential Decision

Losing a dime with a satisfied mind: Positive affect accounts for age-related differences in sequential decision making

PsycEXTRA Dataset ◽

10.1037/e615882011-002 ◽

2009 ◽

Author(s):

Bettina von Helversen ◽

Rui Mata

Keyword(s):

Decision Making ◽

Positive Affect ◽

Sequential Decision Making ◽

Sequential Decision ◽

Age Related

Stopping Policies in Sequential Decision Making

PsycEXTRA Dataset ◽

10.1037/e722982011-071 ◽

1993 ◽

Author(s):

Gad Saad ◽

J. Edward Russo

Keyword(s):

Decision Making ◽

Sequential Decision Making ◽

Sequential Decision

Introduction to Sequential Decision-Making

Reciprocity, Evolution, and Decision Games in Network and Data Science ◽

10.1017/9781108859783.015 ◽

2021 ◽

pp. 249-252

Keyword(s):

Decision Making ◽

Sequential Decision Making ◽

Sequential Decision

Optimal Policies for Quantum Markov Decision Processes

International Journal of Automation and Computing ◽

10.1007/s11633-021-1278-z ◽

2021 ◽

Author(s):

Ming-Sheng Ying ◽

Yuan Feng ◽

Sheng-Gang Ying

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Quantum Systems ◽

Sequential Decision Making ◽

Mathematical Framework ◽

Sequential Decision ◽

Learning Techniques ◽

Optimal Policies ◽

Markov Decision ◽

Programming Algorithms

The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of MDP, namely quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
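
For orientation, finite-horizon dynamic programming for a classical MDP can be sketched as backward induction. This is the standard classical technique, not the qMDP algorithms of the paper; the function name `backward_induction`, the dictionary layout of `P` and `R`, and the toy two-state problem are assumptions made for illustration.

```python
def backward_induction(states, actions, P, R, horizon):
    """Finite-horizon backward induction for a classical MDP (illustrative).

    P[s][a] maps next states to probabilities; R[s][a] is the immediate
    reward. Returns the optimal values at time 0 and a list of per-step
    policies, indexed by time step.
    """
    V = {s: 0.0 for s in states}  # value after the final step
    policy = []
    for _ in range(horizon):
        # Action values for the current step, given the values one step later.
        Q = {
            s: {
                a: R[s][a] + sum(p * V[s2] for s2, p in P[s][a].items())
                for a in actions
            }
            for s in states
        }
        step_policy = {s: max(actions, key=lambda a: Q[s][a]) for s in states}
        V = {s: Q[s][step_policy[s]] for s in states}
        policy.append(step_policy)
    policy.reverse()  # steps were computed from last to first
    return V, policy


states = ["A", "B"]
actions = ["stay", "go"]
P = {
    "A": {"stay": {"A": 1.0}, "go": {"B": 1.0}},
    "B": {"stay": {"B": 1.0}, "go": {"A": 1.0}},
}
R = {
    "A": {"stay": 0.0, "go": 1.0},
    "B": {"stay": 2.0, "go": 0.0},
}
values, plan = backward_induction(states, actions, P, R, horizon=2)
```

In this toy problem the two-step plan moves from A to B at the first step and then stays in B, since staying in B pays the highest reward.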
