Measuring Trial-wise Choice Difficulty in Multi-feature Reinforcement Learning

Author(s):  
Peter Hitchcock ◽  
Yael Niv ◽  
Angela Radulescu ◽  
Nina Jill Rothstein ◽  
Chris R. Sims

Real-world reinforcement learning (RL) requires learning about stimuli composed of multiple features, only some of which are relevant to reinforcement. We investigated RL in a multi-feature task known as the Dimensions Task. Past work developed a computational model of this task in which the expected value of a stimulus comprises weights assigned to the stimulus's features, so the weights estimate the importance of each feature. We studied these weights and how they relate to human behavior. We found that a sparse subset of features accrued most of the weight, and just 2 of 9 features exerted a significant influence on reaction time (RT), suggesting that this pair of features largely drives choice. These findings clarify that the Dimensions Task requires selectively attending to a sparse subset of features while ignoring numerous irrelevant ones, distinguishing it from other recent multi-feature RL tasks that either require attending to all features or learning to treat feature conjunctions as objects. We next examined whether the feature weights could be used to develop a trial-wise marker of choice difficulty that relates to individual differences. We found that high- (vs. low-) performing participants were better able to calibrate their responses to variation in the standard deviation (SD) of the weights of the 2 features influencing RT, suggesting that better-performing participants may be more responsive to the difference between these features. We discuss how this measure of trial-wise choice difficulty could be applied in experimental and translational research.
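As a minimal sketch of the feature-weighted value model described above, the snippet below (Python) computes a stimulus's expected value as the sum of its features' learned weights and derives the SD-based trial-wise difficulty marker from the two RT-relevant features. The weight values, the identity of those two features, and all names here are illustrative assumptions, not the authors' fitted model.

```python
import numpy as np

def stimulus_value(weights, feature_idx):
    """Expected value of a stimulus: the sum of the learned weights
    of the features it contains (linear feature-weighted value model)."""
    return sum(weights[i] for i in feature_idx)

def trial_difficulty(weights, rt_features=(0, 1)):
    """Trial-wise difficulty proxy: the SD of the weights of the two
    features found to influence RT. A larger SD means the two weights
    are more clearly separated, plausibly an easier discrimination."""
    return float(np.std([weights[i] for i in rt_features]))

# Illustrative 9-feature weight vector: sparse, with most weight
# concentrated on two features.
w = np.array([0.62, 0.21, 0.03, 0.01, 0.00, 0.04, 0.02, 0.00, 0.03])
print(stimulus_value(w, [0, 3, 6]))  # stimulus carrying features 0, 3, 6
print(trial_difficulty(w))           # difficulty marker for this trial
```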

2021 ◽  
Vol 22 (2) ◽  
pp. 12-18 ◽  
Author(s):  
Hua Wei ◽  
Guanjie Zheng ◽  
Vikash Gayah ◽  
Zhenhui Li

Traffic signal control is an important and challenging real-world problem that has recently received considerable interest from both the transportation and computer science communities. In this survey, we focus on recent advances in using reinforcement learning (RL) techniques to solve the traffic signal control problem. We classify the known approaches based on the RL techniques they use and review existing models, analyzing their advantages and disadvantages. Moreover, we give an overview of the simulation environments and experimental settings that have been developed to evaluate traffic signal control methods. Finally, we explore future directions in the area of RL-based traffic signal control. We hope this survey provides insights to researchers dealing with real-world applications in intelligent transportation systems.
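To make the RL framing concrete (a generic illustration, not a specific method from the survey), a tabular Q-learning controller for a single intersection might look like the sketch below, where the state is a tuple of binned queue lengths per approach and the action selects the next signal phase; the reward, e.g., the negative total queue length after the phase is served, would come from a traffic simulator.

```python
import random
from collections import defaultdict

class SignalAgent:
    """Minimal tabular Q-learning controller for one intersection."""

    def __init__(self, n_phases=2, alpha=0.1, gamma=0.95, eps=0.1):
        self.q = defaultdict(float)  # Q[(state, action)] -> estimated value
        self.n_phases = n_phases
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, state):
        if random.random() < self.eps:  # epsilon-greedy exploration
            return random.randrange(self.n_phases)
        return max(range(self.n_phases), key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        # One-step Q-learning backup toward r + gamma * max_a' Q(s', a').
        best_next = max(self.q[(s_next, a2)] for a2 in range(self.n_phases))
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])
```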


2021 ◽  
Author(s):  
Gabriel Dulac-Arnold ◽  
Nir Levine ◽  
Daniel J. Mankowitz ◽  
Jerry Li ◽  
Cosmin Paduraru ◽  
...  

2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

Abstract In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and present three application scenarios: (1) high-level decision-making in a highway driving scenario; (2) inferring user preferences in a social network (Twitter); and (3) managing water release in Lake Como. For each of these scenarios, we provide a formalization, experiments, and a discussion to interpret the obtained results.
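The abstract does not detail the batch algorithms used, but the core batch quantity in much linear-reward IRL (e.g., feature-expectation matching) can be estimated directly from a fixed demonstration dataset, with no further environment interaction. A hedged sketch, assuming a linear reward R(s, a) = w . phi(s, a):

```python
import numpy as np

def feature_expectations(trajectories, phi, gamma=0.99):
    """Empirical discounted feature expectations of the expert, computed
    purely from a fixed batch of demonstrations. `phi` maps (s, a) to a
    feature vector; `trajectories` is a list of [(s, a), ...] lists."""
    mu = None
    for traj in trajectories:
        for t, (s, a) in enumerate(traj):
            f = (gamma ** t) * np.asarray(phi(s, a), dtype=float)
            mu = f if mu is None else mu + f
    return mu / len(trajectories)

# Under R(s, a) = w . phi(s, a), the expert's expected return is
# w . mu_expert, so recovering w reduces to finding weights under which
# the expert's feature expectations look optimal.
```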


Author(s):  
Mana Kobayashi ◽  
Yutaro Kageyama ◽  
Takashi Ando ◽  
Junko Sakamoto ◽  
Shohji Kimura

Abstract

Background: Rituximab is conditionally approved in Japan for use in patients with refractory nephrotic syndrome. To meet the conditions of approval, an all-case post-marketing surveillance study was conducted to confirm the real-world safety and efficacy of rituximab in patients of all ages with refractory nephrotic syndrome.

Methods: All patients scheduled to receive rituximab treatment for refractory nephrotic syndrome were eligible to register (registration: August 29, 2014 through April 15, 2016); the planned observation period was 2 years from the initiation of rituximab treatment (intravenous infusion, 375 mg/m2 once weekly for four doses). The study was conducted at 227 hospitals throughout Japan. Adverse drug reactions (ADRs) were collected as safety outcomes. The efficacy outcomes were the relapse-free period and the degree of growth in pediatric (< 15 years) patients.

Results: In total, 997 (447 pediatric) patients were registered; 981 (445) were included in the safety analysis set; 852 (402) completed the 2-year observation period; and 810 (429) were included in the efficacy analysis set. Refractory nephrotic syndrome had developed in childhood for 85.0% of patients, and 54.6% were aged ≥ 15 years. ADRs were observed in 527 (53.7%) patients, treatment-related infections/infestations in 235 (24.0%) patients, and infusion reactions in 313 (31.9%) patients. The relapse-free period was 580 days (95% confidence interval, 511–664). There was a significant change in height standard deviation score among pediatric patients (mean change, 0.093; standard deviation, 0.637; P = 0.009).

Conclusion: The safety and efficacy of rituximab treatment in patients with refractory nephrotic syndrome were confirmed in the real-world setting.

Clinical trial registration: UMIN000014997.


2021 ◽  
pp. 193229682110075
Author(s):  
Rebecca A. Harvey Towers ◽  
Xiaohe Zhang ◽  
Rasoul Yousefi ◽  
Ghazaleh Esmaili ◽  
Liang Wang ◽  
...  

The algorithm for the Dexcom G6 CGM System was enhanced to retain accuracy while reducing the frequency and duration of sensor error. The new algorithm was evaluated by post-processing raw signals collected from the G6 pivotal trials (NCT02880267) and by assessing the difference in data availability after a limited real-world launch. Accuracy was comparable with the new algorithm: the overall %20/20 was 91.7% before and 91.8% after the algorithm modification, and MARD was unchanged. The mean data gap due to sensor error nearly halved, and the total time spent in sensor error decreased by 59%. The limited field launch showed similar results, with a 43% decrease in total time spent in sensor error. Increased data availability may improve the patient experience and the integration of CGM data into insulin delivery systems.
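For readers unfamiliar with the accuracy metrics quoted here, the sketch below shows how MARD and %20/20 are typically computed from matched CGM/reference glucose pairs (in mg/dL). The 100 mg/dL cutoff separating the absolute (20 mg/dL) and relative (20%) criteria is a common convention, assumed here rather than taken from this report.

```python
import numpy as np

def mard(cgm, ref):
    """Mean absolute relative difference (%) between matched CGM
    and reference glucose readings."""
    cgm, ref = np.asarray(cgm, float), np.asarray(ref, float)
    return 100.0 * np.mean(np.abs(cgm - ref) / ref)

def pct_20_20(cgm, ref, cutoff=100.0):
    """%20/20: share of CGM readings within 20 mg/dL of reference when
    reference <= cutoff, and within 20% of reference otherwise."""
    cgm, ref = np.asarray(cgm, float), np.asarray(ref, float)
    tol = np.where(ref <= cutoff, 20.0, 0.20 * ref)
    return 100.0 * np.mean(np.abs(cgm - ref) <= tol)
```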


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Abstract Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to the many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or to distribution mismatch, when the learner's goal deviates from the demonstrated behaviors. Moreover, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the Mujoco domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.
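The abstract does not specify the prioritization rule beyond sampling goals where expert-policy disagreement is maximized; one plausible sketch is softmax-weighted sampling over measured disagreement, as below. The disagreement measure and the softmax form are illustrative assumptions, not necessarily the authors' scheme.

```python
import numpy as np

def sample_goal(goals, disagreement, temperature=1.0, rng=None):
    """Sample a goal with probability increasing in the expert-policy
    disagreement measured for that goal (e.g., mean distance between
    expert and policy actions on that goal's demonstrations)."""
    rng = rng or np.random.default_rng()
    scores = np.array([disagreement[g] for g in goals], float) / temperature
    probs = np.exp(scores - scores.max())  # numerically stable softmax
    probs /= probs.sum()
    return goals[rng.choice(len(goals), p=probs)]
```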


2021 ◽  
pp. 027836492098785
Author(s):  
Julian Ibarz ◽  
Jie Tan ◽  
Chelsea Finn ◽  
Mrinal Kalakrishnan ◽  
Peter Pastor ◽  
...  

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which does not reflect the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as embodied agents in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building on these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource for both roboticists and machine learning researchers who are interested in furthering the progress of deep RL in the real world.

