Learning User Preferences via Reinforcement Learning with Spatial Interface Valuing

Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems

Machine Learning ◽

10.1007/s10994-020-05939-8 ◽

2021 ◽

Author(s):

Amarildo Likmeta ◽

Alberto Maria Metelli ◽

Giorgia Ramponi ◽

Andrea Tirinzoni ◽

Matteo Giuliani ◽

...

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Real Life ◽

User Preferences ◽

Inverse Reinforcement Learning ◽

Water Release ◽

Reward Function ◽

Model Free ◽

Conflicting Objectives ◽

Multiple Experts

Abstract: In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, which helps to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging because typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and present three application scenarios: (1) high-level decision-making in a highway driving scenario, (2) inferring user preferences in a social network (Twitter), and (3) managing the water release of Lake Como. For each of these scenarios, we provide a formalization, experiments, and a discussion to interpret the obtained results.

Online Caching Policy with User Preferences and Time-Dependent Requests: A Reinforcement Learning Approach

2019 53rd Asilomar Conference on Signals, Systems, and Computers ◽

10.1109/ieeeconf44664.2019.9048832 ◽

2019 ◽

Cited By ~ 2

Author(s):

Mohammad Hatami ◽

Markus Leinonen ◽

Marian Codreanu

Keyword(s):

Reinforcement Learning ◽

User Preferences ◽

Time Dependent ◽

Learning Approach ◽

Caching Policy

Split Q Learning: Reinforcement Learning with Two-Stream Rewards

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/913 ◽

2019 ◽

Author(s):

Baihan Lin ◽

Djallel Bouneffouf ◽

Guillermo Cecchi

Keyword(s):

Reinforcement Learning ◽

Wide Spectrum ◽

User Preferences ◽

Reward Processing ◽

Q Learning ◽

Agent Interactions ◽

Behavioral Studies ◽

Human Decision ◽

Multi Agent ◽

Learning Reinforcement

Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for reinforcement learning that extends the standard Q-learning approach with a two-stream framework of reward processing, with biases biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For the AI community, developing agents that react differently to different types of rewards can help us understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions and user preferences in long-term recommendation systems.
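
The two-stream idea in the abstract above can be illustrated with a minimal tabular sketch. This is not the authors' implementation: it assumes that positive and negative rewards are tracked in two separate Q-tables, that each stream has its own reward weight, and that actions are chosen on the sum of the two streams. The class name `SplitQAgent` and the parameters `w_pos` and `w_neg` are hypothetical names introduced only for this illustration.

```python
import random
from collections import defaultdict


class SplitQAgent:
    """Illustrative two-stream Q-learner (hypothetical, not the paper's code)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1,
                 w_pos=1.0, w_neg=1.0, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration rate
        self.w_pos = w_pos        # assumed weight on the positive reward stream
        self.w_neg = w_neg        # assumed weight on the negative reward stream
        self.q_pos = defaultdict(lambda: [0.0] * n_actions)
        self.q_neg = defaultdict(lambda: [0.0] * n_actions)
        self.rng = random.Random(seed)

    def combined(self, state):
        # Assumption: the agent acts on the sum of both streams.
        return [p + n for p, n in zip(self.q_pos[state], self.q_neg[state])]

    def greedy(self, state):
        values = self.combined(state)
        return max(range(self.n_actions), key=values.__getitem__)

    def act(self, state):
        # Epsilon-greedy selection over the combined value.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy(state)

    def update(self, state, action, reward, next_state):
        # Split the scalar reward into its positive and negative parts.
        r_pos, r_neg = max(reward, 0.0), min(reward, 0.0)
        a_star = self.greedy(next_state)
        streams = ((self.q_pos, r_pos, self.w_pos),
                   (self.q_neg, r_neg, self.w_neg))
        for table, r, w in streams:
            # Standard Q-learning step applied to each stream separately.
            target = w * r + self.gamma * table[next_state][a_star]
            table[state][action] += self.alpha * (target - table[state][action])


agent = SplitQAgent(n_actions=2, alpha=0.5, gamma=0.0, epsilon=0.0, w_neg=0.5)
agent.update(state=0, action=0, reward=2.0, next_state=1)
agent.update(state=0, action=1, reward=-4.0, next_state=1)
```

Under these assumptions, setting `w_neg` below `w_pos` yields an agent that discounts losses relative to gains, which is one simple way to model the differing reward sensitivities the abstract mentions.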
