Measuring Trial-wise Choice Difficulty in Multi-feature Reinforcement Learning

Real-world reinforcement learning (RL) requires learning about stimuli composed of multiple features, only some of which are relevant to reinforcement. We investigated RL in a multi-feature task known as the Dimensions Task. Past work developed a computational model of this task, where the expected value of a stimulus comprises weights assigned to the stimulus’s features; hence, the weights estimate the importance of each feature. We studied these weights and how they relate to human behavior. We found that a sparse subset of features accrued much weight, and just 2 of 9 features exerted a significant influence on reaction time (RT), suggesting this pair of features mostly influences choice. These findings clarify that the Dimensions Task requires selectively attending to just a sparse subset of features while ignoring numerous irrelevant features, emphasizing its distinction from other recent multi-feature RL tasks that either require attending to all features or learning to treat feature conjunctions as objects. We next examined whether we could use the feature weights to develop a trial-wise marker of choice difficulty that related to individual differences. We found that high (vs. low) performing participants were better able to calibrate their responses based on variation in the standard deviation (SD) of the 2 features influencing RT. This suggests better-performing participants may be more responsive to the difference between the features. We discuss how this measure of trial-wise choice difficulty could be applied in experimental and translational research.
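
As a rough illustration of the quantities this abstract describes, the sketch below computes a stimulus value as the sum of its feature weights and takes the SD of two feature weights as a trial-wise difficulty marker. The feature names, the weight values, the 3 x 3 layout of the 9 features, and the reading of "SD of the 2 features" as the SD of their two weights are all assumptions made for illustration; the abstract does not specify them.

```python
from statistics import pstdev

def stimulus_value(weights, features):
    """Sum the weights of the features that make up one stimulus."""
    return sum(weights[f] for f in features)

def pair_sd(weights, pair):
    """Population SD of two feature weights (equals |w_a - w_b| / 2)."""
    return pstdev([weights[f] for f in pair])

# Hypothetical weights for 9 features (assumed: 3 dimensions x 3 features).
weights = {
    "red": 0.8, "green": 0.1, "blue": 0.0,
    "circle": 0.4, "square": 0.0, "triangle": 0.1,
    "dots": 0.0, "stripes": 0.1, "waves": 0.0,
}

# Value of a stimulus made of one feature per dimension.
value = stimulus_value(weights, ["red", "square", "dots"])

# Assumed difficulty marker: SD of the two most influential feature weights.
difficulty = pair_sd(weights, ("red", "circle"))
```

With only two weights, the population SD is half their absolute difference, which fits the abstract's remark about responsiveness to "the difference between the features".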

A Serial Reaction Time Task for Rats: Individual Differences in Sequence Encoding

PsycEXTRA Dataset ◽

10.1037/e603992013-111 ◽

2006 ◽

Author(s):

Stephen B. Fountain ◽

Amber M. Chenoweth

Keyword(s):

Individual Differences ◽

Reaction Time ◽

Reaction Time Task ◽

Serial Reaction Time Task ◽

Serial Reaction Time ◽

Time Task ◽

Serial Reaction ◽

Sequence Encoding

Recent Advances in Reinforcement Learning for Traffic Signal Control

ACM SIGKDD Explorations Newsletter ◽

10.1145/3447556.3447565 ◽

2021 ◽

Vol 22 (2) ◽

pp. 12-18 ◽

Cited By ~ 1

Author(s):

Hua Wei ◽

Guanjie Zheng ◽

Vikash Gayah ◽

Zhenhui Li

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Intelligent Transportation Systems ◽

Transportation Systems ◽

Traffic Signal ◽

Signal Control ◽

Traffic Signal Control ◽

Control Methods ◽

Advantages And Disadvantages ◽

Recent Advances

Traffic signal control is an important and challenging real-world problem that has recently received a large amount of interest from both transportation and computer science communities. In this survey, we focus on investigating the recent advances in using reinforcement learning (RL) techniques to solve the traffic signal control problem. We classify the known approaches based on the RL techniques they use and provide a review of existing models with analysis on their advantages and disadvantages. Moreover, we give an overview of the simulation environments and experimental settings that have been developed to evaluate the traffic signal control methods. Finally, we explore future directions in the area of RL-based traffic signal control methods. We hope this survey could provide insights to researchers dealing with real-world applications in intelligent transportation systems.

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

Machine Learning ◽

10.1007/s10994-021-05961-4 ◽

2021 ◽

Author(s):

Gabriel Dulac-Arnold ◽

Nir Levine ◽

Daniel J. Mankowitz ◽

Jerry Li ◽

Cosmin Paduraru ◽

...

Keyword(s):

Reinforcement Learning ◽

Real World

Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems

Machine Learning ◽

10.1007/s10994-020-05939-8 ◽

2021 ◽

Author(s):

Amarildo Likmeta ◽

Alberto Maria Metelli ◽

Giorgia Ramponi ◽

Andrea Tirinzoni ◽

Matteo Giuliani ◽

...

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Real Life ◽

User Preferences ◽

Inverse Reinforcement Learning ◽

Water Release ◽

Reward Function ◽

Model Free ◽

Conflicting Objectives ◽

Multiple Experts

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and we present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring the user preferences in a social network (Twitter), and (3) the management of the water release in the Como Lake. For each of these scenarios, we provide formalization, experiments and a discussion to interpret the obtained results.

All-case Japanese post-marketing surveillance of the real-world safety and efficacy of rituximab treatment in patients with refractory nephrotic syndrome

Clinical and Experimental Nephrology ◽

10.1007/s10157-021-02035-6 ◽

2021 ◽

Author(s):

Mana Kobayashi ◽

Yutaro Kageyama ◽

Takashi Ando ◽

Junko Sakamoto ◽

Shohji Kimura

Keyword(s):

Nephrotic Syndrome ◽

Standard Deviation ◽

Real World ◽

Pediatric Patients ◽

Safety And Efficacy ◽

The Real ◽

Post Marketing Surveillance ◽

Post Marketing ◽

Refractory Nephrotic Syndrome ◽

Observation Period

Background: Rituximab is conditionally approved in Japan for use in patients with refractory nephrotic syndrome. To meet the conditions of approval, an all-case post-marketing surveillance study was conducted to confirm the real-world safety and efficacy of rituximab in patients of all ages with refractory nephrotic syndrome. Methods: All patients scheduled to receive rituximab treatment for refractory nephrotic syndrome were eligible to register (registration: August 29, 2014 through April 15, 2016); the planned observation period was 2 years from the initiation of rituximab treatment (intravenous infusion, 375 mg/m² once weekly for four doses). The study was conducted at 227 hospitals throughout Japan. Adverse drug reactions (ADRs) were collected for safety outcomes. The efficacy outcomes were relapse-free period and the degree of growth in pediatric (< 15 years) patients. Results: In total, 997 (447 pediatric) patients were registered; 981 (445) were included in the safety analysis set; 852 (402) completed the 2-year observation period; and 810 (429) were included in the efficacy analysis set. Refractory nephrotic syndrome had developed in childhood for 85.0% of patients, and 54.6% were aged ≥15 years. ADRs were observed in 527 (53.7%) patients, treatment-related infection/infestation in 235 (24.0%) patients, and infusion reactions in 313 (31.9%) patients. The relapse-free period was 580 days (95% confidence interval, 511–664). There was a significant change in height standard deviation score (pediatric patients; mean change, 0.093; standard deviation, 0.637; P = 0.009). Conclusion: The safety and efficacy of rituximab treatment in patients with refractory nephrotic syndrome were confirmed in the real-world setting. Clinical trial registration: UMIN000014997.

A Modified CGM Algorithm Enhances Data Availability While Retaining iCGM Performance

Journal of Diabetes Science and Technology ◽

10.1177/19322968211007521 ◽

2021 ◽

pp. 193229682110075

Author(s):

Rebecca A. Harvey Towers ◽

Xiaohe Zhang ◽

Rasoul Yousefi ◽

Ghazaleh Esmaili ◽

Liang Wang ◽

...

Keyword(s):

Data Integration ◽

Patient Experience ◽

Real World ◽

Data Availability ◽

Post Processing ◽

Sensor Error ◽

The Mean ◽

The Difference ◽

Data Gap ◽

Improve Patient

The algorithm for the Dexcom G6 CGM System was enhanced to retain accuracy while reducing the frequency and duration of sensor error. The new algorithm was evaluated by post-processing raw signals collected from G6 pivotal trials (NCT02880267) and by assessing the difference in data availability after a limited, real-world launch. Accuracy was comparable with the new algorithm—the overall %20/20 was 91.7% before and 91.8% after the algorithm modification; MARD was unchanged. The mean data gap due to sensor error nearly halved and total time spent in sensor error decreased by 59%. A limited field launch showed similar results, with a 43% decrease in total time spent in sensor error. Increased data availability may improve patient experience and CGM data integration into insulin delivery systems.
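
The two accuracy metrics named in this abstract can be sketched as follows. The definitions used here are the commonly cited ones and are assumptions, since the abstract does not define them: MARD is taken as the mean absolute relative difference between CGM and reference readings, and %20/20 as the share of CGM readings within 20 mg/dL of the reference when the reference is below a cutoff, or within 20% otherwise. The 100 mg/dL cutoff and the sample readings are illustrative only.

```python
def mard(cgm, ref):
    """Mean absolute relative difference, in percent."""
    return 100 * sum(abs(c - r) / r for c, r in zip(cgm, ref)) / len(ref)

def pct_20_20(cgm, ref, cutoff=100.0):
    """Percent of CGM readings within 20 mg/dL (ref < cutoff) or 20% (otherwise)."""
    hits = 0
    for c, r in zip(cgm, ref):
        # Absolute tolerance at low glucose, relative tolerance elsewhere.
        tolerance = 20.0 if r < cutoff else 0.20 * r
        if abs(c - r) <= tolerance:
            hits += 1
    return 100 * hits / len(ref)

# Made-up paired readings in mg/dL (reference first, then CGM).
ref = [80.0, 120.0, 200.0, 250.0]
cgm = [95.0, 110.0, 250.0, 240.0]

overall_mard = mard(cgm, ref)
overall_20_20 = pct_20_20(cgm, ref)
```

On these made-up readings, three of the four pairs fall within tolerance, so `overall_20_20` is 75.0.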

Goal-driven active learning

Autonomous Agents and Multi-Agent Systems ◽

10.1007/s10458-021-09527-5 ◽

2021 ◽

Vol 35 (2) ◽

Author(s):

Nicolas Bougie ◽

Ryutaro Ichise

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Learning Process ◽

Real World ◽

Imitation Learning ◽

Learning Approaches ◽

Wide Range ◽

Fixed Set ◽

Complex Decision Making ◽

Complex Decision

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch, which arises when the learner’s goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.

Socially Compliant Robot Navigation in Crowded Environment by Human Behavior Resemblance Using Deep Reinforcement Learning

IEEE Robotics and Automation Letters ◽

10.1109/lra.2021.3071954 ◽

2021 ◽

Vol 6 (3) ◽

pp. 5223-5230

Author(s):

Sunil Srivatsav Samsani ◽

Mannan Saeed Muhammad

Keyword(s):

Reinforcement Learning ◽

Human Behavior ◽

Robot Navigation ◽

Compliant Robot ◽

Crowded Environment

How to train your robot with deep reinforcement learning: lessons we have learned

The International Journal of Robotics Research ◽

10.1177/0278364920987859 ◽

2021 ◽

pp. 027836492098785

Author(s):

Julian Ibarz ◽

Jie Tan ◽

Chelsea Finn ◽

Mrinal Kalakrishnan ◽

Peter Pastor ◽

...

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Case Studies ◽

Real World ◽

Review Article ◽

The Real ◽

Complex Skills ◽

Real World Learning ◽

Level Sensor ◽

Embodied Agent

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which does not connect with the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as an embodied agent in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building off of these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource both for roboticists and machine learning researchers who are interested in furthering the progress of deep RL in the real world.

Real-World Projectile Catching with Reinforcement Learning: Empirical Analysis using Discretized Simulations

2018 IEEE MIT Undergraduate Research Technology Conference (URTC) ◽

10.1109/urtc45901.2018.9244779 ◽

2018 ◽

Author(s):

Bryon Kucharski ◽

Adam Ziel ◽

Michael Hickey ◽

Collin Travers

Keyword(s):

Reinforcement Learning ◽

Empirical Analysis ◽

Real World
