dynamic treatment regime

2021
Author(s): Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy

Abstract: We consider the task of Inverse Reinforcement Learning in Contextual Markov Decision Processes (MDPs). In this setting, contexts, which define the reward and transition kernel, are sampled from a distribution. Although the reward is a function of the context, it is not provided to the agent. Instead, the agent observes demonstrations from an optimal policy. The goal is to learn the reward mapping so that the agent acts optimally even when encountering previously unseen contexts, also known as zero-shot transfer. We formulate this problem as a non-differentiable convex optimization problem and propose a novel algorithm to compute its subgradients. Based on this scheme, we analyze several methods both theoretically, comparing their sample complexity and scalability, and empirically. Most importantly, we show both theoretically and empirically that our algorithms perform zero-shot transfer (generalize to new and unseen contexts). Specifically, we present empirical experiments in a dynamic treatment regime, where the goal is to learn a reward function that explains the behavior of expert physicians based on recorded data of them treating patients diagnosed with sepsis.


Biostatistics, 2018, Vol 21 (3), pp. 432-448
Author(s): William J Artman, Inbal Nahum-Shani, Tianshuang Wu, James R Mckay, Ashkan Ertefaie

Summary: Sequential, multiple assignment, randomized trial (SMART) designs have become increasingly popular in the field of precision medicine by providing a means for comparing more than two sequences of treatments tailored to the individual patient, i.e., dynamic treatment regimes (DTRs). The construction of evidence-based DTRs promises a replacement for the ad hoc one-size-fits-all decisions pervasive in patient care. However, there are substantial statistical challenges in sizing SMART designs due to the correlation structure between the DTRs embedded in the design (EDTRs). Since a primary goal of SMARTs is the construction of an optimal EDTR, investigators are interested in sizing SMARTs based on the ability to screen out EDTRs inferior to the optimal EDTR by a given amount, which cannot be done using existing methods. In this article, we fill this gap by developing a rigorous power analysis framework that leverages the multiple comparisons with the best methodology. Our method employs Monte Carlo simulation to compute the number of individuals to enroll in an arbitrary SMART. We evaluate our method through extensive simulation studies. We illustrate our method by retrospectively computing the power in the Extending Treatment Effectiveness of Naltrexone (EXTEND) trial. An R package implementing our methodology is available to download from the Comprehensive R Archive Network.


2017, Vol 26 (4), pp. 1641-1653
Author(s): Michael P Wallace, Erica EM Moodie, David A Stephens

Model assessment is a standard component of statistical analysis, but it has received relatively little attention within the dynamic treatment regime literature. In this paper, we focus on the dynamic weighted ordinary least squares approach to optimal dynamic treatment regime estimation, showing how its double-robustness property may be leveraged for model assessment and how quasilikelihood may be used for model selection. These ideas are demonstrated through simulation studies, as well as through application to data from the Sequenced Treatment Alternatives to Relieve Depression study.
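
To make the double-robustness idea concrete, the following is a minimal single-stage sketch of dynamic weighted ordinary least squares, not the authors' implementation. It assumes a binary treatment, a linear blip (treatment-effect) model in one covariate, the commonly used weights |a - P(A = 1 | x)|, and a known propensity score; the function names, the simulated data, and the parameter values are all hypothetical. The treatment-free model is deliberately misspecified (the true baseline is quadratic, the fitted one is linear), so the blip parameters are recovered only because the propensity is correct.

```python
import numpy as np


def dwols_one_stage(x, a, y, propensity):
    """Single-stage dWOLS sketch: weighted least squares with |a - propensity| weights.

    Returns the estimated blip parameters (psi0, psi1) of the assumed
    blip model a * (psi0 + psi1 * x).
    """
    weights = np.abs(a - propensity)
    # Columns: treatment-free part (1, x), then blip part (a, a * x).
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef[2:]


def optimal_rule(psi, x):
    """Recommend treatment (1) wherever the estimated blip is positive, else 0."""
    return (psi[0] + psi[1] * x > 0).astype(int)


def simulate(n=50_000, seed=0):
    """Hypothetical data: quadratic baseline, true blip a * (1 - 2 * x)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    propensity = 1.0 / (1.0 + np.exp(-x))  # true P(A = 1 | x), assumed known here
    a = rng.binomial(1, propensity).astype(float)
    y = x**2 + a * (1.0 - 2.0 * x) + rng.normal(size=n)
    return x, a, y, propensity


if __name__ == "__main__":
    x, a, y, propensity = simulate()
    psi_hat = dwols_one_stage(x, a, y, propensity)
    print("estimated blip parameters:", psi_hat)
    print("rule at x = 0 and x = 1:", optimal_rule(psi_hat, np.array([0.0, 1.0])))
```

In practice the propensity score would itself be estimated, for example by logistic regression, and a multi-stage regime would apply the same regression backwards through the stages using pseudo-outcomes.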


2016, Vol 12 (1), pp. 157-177
Author(s): Benjamin Rich, Erica E. M. Moodie, David A. Stephens

Abstract: Individualized medicine is a growing area in both clinical and statistical settings; in the latter, personalized treatment strategies are often referred to as dynamic treatment regimens. Estimation of the optimal dynamic treatment regime has focused primarily on semi-parametric approaches, some of which are said to be doubly robust in that they give rise to consistent estimators provided at least one of two models is correctly specified. In particular, the locally efficient doubly robust g-estimation is robust to misspecification of the treatment-free outcome model so long as the propensity model is specified correctly, at the cost of an increase in variability. In this paper, we propose data-adaptive weighting schemes that serve to decrease the impact of influential points and thus stabilize the estimator. In doing so, we provide a doubly robust g-estimator that is also robust in the sense of Hampel (15).

