Sequential Decision Making Using Q Learning Algorithm for Diabetic Patients

Author(s): Pramod Patil, Parag Kulkarni, Rachana Shirsath

Author(s): Hoda Heidari, Andreas Krause

We study fairness in sequential decision-making environments, where at each time step a learning algorithm receives data corresponding to a new individual (e.g. a new job applicant) and must make an irrevocable decision about him/her (e.g. whether to hire the applicant) based on observations made so far. To prevent disparate treatment, our time-dependent notion of fairness requires algorithmic decisions to be consistent: if two individuals are similar in the feature space and arrive during the same time epoch, the algorithm must assign them similar outcomes. We propose a general framework for post-processing the predictions of a black-box learning model that guarantees the resulting sequence of outcomes is consistent. We show theoretically that imposing consistency does not significantly slow down learning, and our experiments on two real-world data sets confirm this finding in practice.
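The abstract does not spell out the post-processing rule, but one natural reading of "similar individuals receive similar outcomes" is a Lipschitz-style constraint. The following is a minimal sketch, assuming outcomes in [0, 1], a Euclidean similarity metric, and an assumed Lipschitz constant L; none of these specifics come from the paper.

```python
import numpy as np

def consistent_decision(x_new, score_new, epoch_history, L=1.0):
    """Post-process a black-box score so the outcome stays consistent
    with outcomes already assigned during the current epoch.

    x_new:         feature vector of the new individual
    score_new:     raw prediction of the black-box model, in [0, 1]
    epoch_history: list of (features, outcome) pairs from this epoch
    L:             assumed Lipschitz constant -- outcomes of individuals
                   at feature distance d may differ by at most L * d
    """
    lo, hi = 0.0, 1.0  # feasible outcome interval
    for x_prev, y_prev in epoch_history:
        d = np.linalg.norm(x_new - x_prev)
        lo = max(lo, y_prev - L * d)
        hi = min(hi, y_prev + L * d)
    # Project the raw score onto the consistency interval. This sketch
    # assumes the interval stays nonempty within an epoch.
    return float(np.clip(score_new, lo, hi))
```

The projection step is where learning can be slowed: the further the constrained outcome is pushed from the raw score, the more the post-processed decision deviates from the model's best guess, which is what the paper's theoretical analysis bounds.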


2019, Vol. 16(3), pp. 172988141985318
Author(s): Zhenhai Gao, Tianjun Sun, Hongwei Xiao

In the development of autonomous driving, decision-making has become one of the key technical challenges. Traditional rule-based decision-making methods lack adaptive capacity when dealing with unfamiliar and complex traffic conditions, whereas reinforcement learning shows potential for solving sequential decision problems. In this article, an independent decision-making method based on reinforcement Q-learning is proposed. First, a Markov decision process model is established through analysis of the car-following task. Then, the state set and action set are designed by jointly considering driving-simulator experimental results and driving-risk principles. Furthermore, the reinforcement Q-learning algorithm is developed, centered on the reward function and the update function. Finally, feasibility is verified through random simulation tests, and the improvement over a traditional method is demonstrated by comparative analysis.
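The update function the abstract refers to is the standard tabular Q-learning rule. Below is a minimal sketch with an epsilon-greedy action choice; the state/action discretization sizes and the reward signal are placeholders, not the paper's actual car-following design.

```python
import random
import numpy as np

# Hypothetical discretization, e.g. relative-distance x relative-speed
# bins crossed with a small set of acceleration commands. The sizes
# below are assumptions, not taken from the paper.
N_STATES, N_ACTIONS = 100, 5
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration

Q = np.zeros((N_STATES, N_ACTIONS))

def choose_action(state):
    # Epsilon-greedy exploration over the discrete action set.
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return int(np.argmax(Q[state]))

def q_update(s, a, r, s_next):
    # Standard Q-learning update. In the paper the reward encodes
    # car-following driving risk; here it is abstracted as r.
    Q[s, a] += ALPHA * (r + GAMMA * np.max(Q[s_next]) - Q[s, a])
```

One call to q_update per simulated time step is all the learning loop needs; the policy improves as the table converges toward the optimal action values.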


Author(s): Andreas A. Malikopoulos, Panos Y. Papalambros, Dennis N. Assanis

Modeling dynamic systems subject to stochastic disturbances in order to derive a control policy is a ubiquitous task in engineering. In some instances, however, obtaining a model of the system may be impractical or impossible. Alternative approaches have been developed using a simulation-based stochastic framework, in which the system interacts with its environment in real time and obtains information that can be processed to produce an optimal control policy. In this context, the problem of developing a policy for controlling the system's behavior is formulated as a sequential decision-making problem under uncertainty. This paper considers real-time sequential decision-making under uncertainty modeled as a Markov Decision Process (MDP). A state-space representation model is constructed through a learning mechanism and is used to improve system performance over time. The model allows decision making based on gradually enhanced knowledge of the system response as it transitions from one state to another, in conjunction with the actions taken at each state. A learning algorithm is implemented that realizes, in real time, the optimal control policy associated with these state transitions. The proposed method is demonstrated on the single cart-pole balancing problem and a vehicle cruise control problem.
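A generic way to realize this "learn the model from interaction, then plan on it" idea is to maintain empirical transition counts and mean rewards and run value iteration on the estimated MDP. The sketch below follows that pattern under stated assumptions (finite discretized state space, fixed discount, periodic re-planning); it is an illustration, not the authors' exact construction.

```python
import numpy as np

class MDPLearner:
    """Minimal sketch of model-based real-time decision making:
    estimate transition probabilities and mean rewards from observed
    (s, a, s', r) interactions, then plan on the learned model."""

    def __init__(self, n_states, n_actions, gamma=0.95):
        self.n_s, self.n_a, self.gamma = n_states, n_actions, gamma
        self.counts = np.zeros((n_states, n_actions, n_states))
        self.rewards = np.zeros((n_states, n_actions))

    def observe(self, s, a, s_next, r):
        # Gradually enhance knowledge of the system response.
        self.counts[s, a, s_next] += 1
        n = self.counts[s, a].sum()
        self.rewards[s, a] += (r - self.rewards[s, a]) / n  # running mean

    def plan(self, iters=100):
        # Value iteration on the empirical model; unvisited (s, a)
        # pairs keep a zero transition row and zero reward.
        totals = np.maximum(self.counts.sum(axis=2, keepdims=True), 1)
        P = self.counts / totals                  # (S, A, S) estimates
        V = np.zeros(self.n_s)
        for _ in range(iters):
            Q = self.rewards + self.gamma * (P @ V)   # (S, A) action values
            V = Q.max(axis=1)
        return Q.argmax(axis=1)                   # greedy policy per state
```

For cart-pole, the continuous pole angle and cart position would first be discretized into the finite state indices this sketch assumes.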


Author(s): Ming-Sheng Ying, Yuan Feng, Sheng-Gang Ying

Markov decision processes (MDPs) offer a general framework for modelling sequential decision making where outcomes are random; in particular, they serve as a mathematical framework for reinforcement learning. This paper introduces an extension of MDPs, namely quantum MDPs (qMDPs), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies of qMDPs in the finite-horizon case. The results obtained in this paper provide useful mathematical tools for reinforcement learning techniques applied to the quantum world.
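The quantum construction is beyond a short sketch, but the finite-horizon dynamic programming the paper builds on is classical backward induction, which qMDPs generalize to quantum state spaces. Below is that classical special case; the array shapes and the greedy-policy output are conventional choices, not notation from the paper.

```python
import numpy as np

def finite_horizon_dp(P, R, T):
    """Backward induction for a classical finite-horizon MDP -- the
    special case that the qMDP dynamic programming algorithms extend.

    P: (A, S, S) transition matrices, P[a, s, s'] = Pr(s' | s, a)
    R: (S, A) immediate rewards
    T: horizon length
    Returns optimal values V[t] and greedy policies pi[t] for t < T."""
    A, S, _ = P.shape
    V = np.zeros((T + 1, S))           # V[T] = 0: terminal values
    pi = np.zeros((T, S), dtype=int)
    for t in range(T - 1, -1, -1):     # sweep backwards over time
        # Q[s, a] = R[s, a] + sum_s' P[a, s, s'] * V[t+1][s']
        Q = R + np.einsum('aij,j->ia', P, V[t + 1])
        V[t] = Q.max(axis=1)
        pi[t] = Q.argmax(axis=1)
    return V, pi
```

Policy evaluation is the same sweep with the max/argmax replaced by the value of the fixed policy's action at each state and time step.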

