Reinforcement-learning optimal control for type-1 diabetes

Author(s):  
Phuong D. Ngo ◽  
Susan Wei ◽  
Anna Holubova ◽  
Jan Muzik ◽  
Fred Godtliebsen


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5058 ◽  
Author(s):  
Taiyu Zhu ◽  
Kezhi Li ◽  
Lei Kuang ◽  
Pau Herrero ◽  
Pantelis Georgiou

(1) Background: People living with type 1 diabetes (T1D) require self-management to maintain blood glucose (BG) levels in a therapeutic range through the delivery of exogenous insulin. However, due to the variability, uncertainty, and complexity of glucose dynamics, optimizing insulin doses to minimize the risk of hyperglycemia and hypoglycemia remains an open problem. (2) Methods: In this work, we propose a novel insulin bolus advisor that uses deep reinforcement learning (DRL) and continuous glucose monitoring to optimize insulin dosing at mealtime. In particular, an actor-critic model based on the deep deterministic policy gradient is designed to compute mealtime insulin doses. The proposed system architecture uses a two-step learning framework, in which a population model is first obtained and then personalized with subject-specific data. Prioritized memory replay is adopted to accelerate the training process in clinical practice. To validate the algorithm, we employ a customized version of the FDA-accepted UVA/Padova T1D simulator to perform in silico trials on 10 adult subjects and 10 adolescent subjects. (3) Results: Compared to a standard bolus calculator as the baseline, the DRL insulin bolus advisor significantly improved the average percentage of time in the target range (70–180 mg/dL) from 74.1%±8.4% to 80.9%±6.9% (p<0.01) and from 54.9%±12.4% to 61.6%±14.1% (p<0.01) in the adult and adolescent cohorts, respectively, while reducing hypoglycemia. (4) Conclusions: The proposed algorithm has the potential to improve mealtime bolus insulin delivery in people with T1D and is a feasible candidate for future clinical validation.
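For context, the baseline the DRL advisor is compared against is a standard bolus calculator. A minimal sketch of that calculation is below; the parameter names and default values (ICR, ISF, target, IOB) are illustrative assumptions about the usual formula, not values taken from the paper:

```python
def standard_bolus(carbs_g, glucose_mgdl, icr=10.0, isf=50.0,
                   target_mgdl=110.0, iob_u=0.0):
    """Standard bolus calculator: a meal dose from the
    insulin-to-carbohydrate ratio (ICR), plus a correction dose from
    the insulin sensitivity factor (ISF), minus insulin on board (IOB)."""
    meal_dose = carbs_g / icr                          # units of insulin for the meal
    correction = (glucose_mgdl - target_mgdl) / isf    # correction toward target
    return max(meal_dose + correction - iob_u, 0.0)    # never recommend a negative dose
```

A DRL advisor of the kind proposed learns to adapt this mealtime dose from CGM data rather than relying on fixed, hand-tuned ICR/ISF values.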


2018 ◽  
Author(s):  
Mahsa Oroojeni Mohammad Javad ◽  
Stephen Olusegun Agboola ◽  
Kamal Jethwani ◽  
Ibrahim Zeid ◽  
Sagar Kamarthi

BACKGROUND Diabetes is a serious chronic disease marked by high blood glucose levels. It results from problems with how insulin is produced and/or how insulin functions in the body. In the long run, uncontrolled blood sugar can damage the vessels that supply blood to vital organs such as the heart, kidneys, eyes, and nerves. Currently there are no effective algorithms that automatically recommend an insulin dosage level tailored to the characteristics of a diabetic patient. OBJECTIVE The objective of this work is to develop and validate a general reinforcement learning framework, and a related learning model, for personalized treatment and management of Type 1 diabetes and its complications. METHODS This research presents a model-free reinforcement learning (RL) algorithm that recommends an insulin level to regulate the blood glucose level of a diabetic patient, considering his/her state as defined by A1C level, alcohol usage, activity level, and BMI value. In this approach, an RL agent learns from its exploration and from the responses of diabetic patients when they are subjected to different actions, i.e., insulin dosage levels. As a result of a treatment action at time step t, the RL agent receives a numeric reward depending on the response of the patient's blood glucose level. At each stage the reward for the learning agent is calculated as a function of the difference between the glucose level in the patient's body and its target level. The RL algorithm was trained on ten years of clinical data of 87 patients obtained from Mass General Hospital. Demographically, 59% of the patients are male and 41% are female; the median age is 54 years and the mean is 52.92 years; 86% of the patients are white and 47% of the 87 patients are married. RESULTS The performance of the algorithm was evaluated on 60 test cases. In addition, a Support Vector Machine (SVM) was applied to Lantus class prediction, and its results were compared with the Q-learning algorithm's recommendations.
The results show that the RL recommendations of insulin levels for the test patients match the actual prescriptions. The RL algorithm made predictions with an accuracy of 88%, while the SVM showed 80% accuracy. CONCLUSIONS Since the RL algorithm can select actions that improve the patient's condition by taking delayed effects into account, it has good potential for controlling blood glucose levels in diabetic patients.
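The scheme the abstract describes — a reward computed from the gap between measured and target glucose, feeding a model-free Q-learning update over discrete patient states and dosage levels — can be sketched as follows. The state/dose discretization, reward scale, and learning-rate values here are illustrative assumptions, not the authors' settings:

```python
import numpy as np

def reward(glucose_mgdl, target_mgdl=120.0):
    # Reward is a function of the difference between measured and target
    # glucose: zero at target, increasingly negative as the gap grows.
    return -abs(glucose_mgdl - target_mgdl)

def q_update(Q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    # One-step Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a').
    td_target = r + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])

# Illustrative discretization: 4 patient states (e.g. bins over A1C, BMI,
# activity, alcohol usage) and 3 insulin dosage levels.
Q = np.zeros((4, 3))
q_update(Q, state=0, action=1, r=reward(150.0), next_state=2)
```

Because the update bootstraps from the value of the next state, the agent can credit a dose for its delayed effect on glucose, which is the property the conclusion highlights.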


2013 ◽  
Vol 76 (3) ◽  
pp. 343-352 ◽  
Author(s):  
Blanca S. Leon ◽  
Alma Y. Alanis ◽  
Edgar N. Sanchez ◽  
Fernando Ornelas-Tellez ◽  
Eduardo Ruiz-Velazquez

JMIR Diabetes ◽  
10.2196/12905 ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. e12905 ◽  
Author(s):  
Mahsa Oroojeni Mohammad Javad ◽  
Stephen Olusegun Agboola ◽  
Kamal Jethwani ◽  
Abe Zeid ◽  
Sagar Kamarthi

Background Type 1 diabetes mellitus (T1DM) is characterized by chronic insulin deficiency and consequent hyperglycemia. Patients with T1DM require long-term exogenous insulin therapy to regulate blood glucose levels and prevent the long-term complications of the disease. Currently, there are no effective algorithms that consider the unique characteristics of T1DM patients to automatically recommend personalized insulin dosage levels. Objective The objective of this study was to develop and validate a general reinforcement learning (RL) framework for the personalized treatment of T1DM using clinical data. Methods This research presents a model-free data-driven RL algorithm, namely Q-learning, that recommends insulin doses to regulate the blood glucose level of a T1DM patient, considering his or her state defined by glycated hemoglobin (HbA1c) levels, body mass index, engagement in physical activity, and alcohol usage. In this approach, the RL agent identifies the different states of the patient by exploring the patient’s responses when he or she is subjected to varying insulin doses. On the basis of the result of a treatment action at time step t, the RL agent receives a numeric reward, positive or negative. The reward is calculated as a function of the difference between the actual blood glucose level achieved in response to the insulin dose and the targeted HbA1c level. The RL agent was trained on 10 years of clinical data of patients treated at the Mass General Hospital. Results A total of 87 patients were included in the training set. The mean age of these patients was 53 years, 59% (51/87) were male, 86% (75/87) were white, and 47% (41/87) were married. The performance of the RL agent was evaluated on 60 test cases. The RL agent–recommended insulin dosage interval included the actual dose prescribed by the physician in 53 of 60 cases (53/60, 88%).
Conclusions This exploratory study demonstrates that an RL algorithm can be used to recommend personalized insulin doses to achieve adequate glycemic control in patients with T1DM. However, further investigation in a larger sample of patients is needed to confirm these findings.
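The 88% figure in this study is an interval-containment rate: the recommended dosage interval is counted correct when it contains the physician's actual prescription. A minimal sketch of that evaluation, with made-up toy data (the intervals and doses below are illustrative, not the study's):

```python
def interval_match_rate(recommended, prescribed):
    """Fraction of cases where the recommended dose interval (lo, hi)
    contains the dose the physician actually prescribed."""
    hits = sum(lo <= dose <= hi
               for (lo, hi), dose in zip(recommended, prescribed))
    return hits / len(prescribed)

# Toy example: 4 of the 5 recommended intervals contain the prescribed dose.
recs = [(10, 20), (20, 30), (5, 15), (25, 35), (0, 10)]
actual = [15, 22, 18, 30, 5]
rate = interval_match_rate(recs, actual)  # 0.8
```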


2011 ◽  
Vol 44 (1) ◽  
pp. 5012-5017 ◽  
Author(s):  
András György ◽  
Péter Szalay ◽  
Dániel A. Drexler ◽  
Balázs Benyó ◽  
Zoltán Benyó ◽  
...  

2020 ◽  
Vol 1 ◽  
pp. 6 ◽  
Author(s):  
Miguel Angel Tejedor Hernandez ◽  
Jonas Nordhaug Myhre

Reinforcement learning (RL) is a promising direction in adaptive and personalized type 1 diabetes (T1D) treatment. However, the reward function, one of the most critical components of RL, is in most cases hand-designed and often overlooked. In this paper we show that different reward functions can dramatically influence the final result when using RL to treat in silico T1D patients.
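To see how the choice of reward can change what an agent is optimized for, compare a symmetric deviation penalty with one that penalizes glucose below target more heavily (hypoglycemia being the riskier excursion). Both reward functions and both glucose traces below are illustrative, not taken from the paper:

```python
def r_symmetric(glucose, target=112.5):
    # Penalize deviation from target equally in both directions.
    return -abs(glucose - target)

def r_hypo_averse(glucose, target=112.5):
    # Penalize readings below target 3x as heavily as readings above it.
    dev = glucose - target
    return -3.0 * abs(dev) if dev < 0 else -dev

trace_a = [80.0, 120.0]   # dips toward hypoglycemia, then near target
trace_b = [140.0, 150.0]  # stays mildly hyperglycemic throughout

score = lambda r, trace: sum(r(g) for g in trace)

# The symmetric reward ranks trace_a above trace_b; the hypo-averse reward
# ranks them the other way round, so the two rewards would push an RL agent
# toward different policies on the same patient data.
```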

