Optimal Reinforcement Learning with Asymmetric Updating in Volatile Environments: a Simulation Study

Abstract: The ability to predict the future is essential for decision-making and for interacting with the environment so as to avoid punishment and gain reward. Reinforcement learning algorithms provide a normative account of interactive learning, especially in volatile environments. The optimal strategy for the classic reinforcement learning model is to increase the learning rate as volatility increases. Inspired by optimistic bias in humans, an alternative reinforcement learning model has been developed by adding a separate punishment learning rate to the classic model. In this study, we aim to 1) compare the performance of these two models across different environments, and 2) find the optimal parameters for each model. Our simulations indicate that having separate learning rates for rewards and punishments increases performance in a volatile environment. Investigation of the optimal parameters shows that in almost all environments, a reward learning rate higher than the punishment learning rate yields higher performance, measured here as the accumulation of more rewards. Our results suggest that achieving high performance requires a shorter memory window for recent rewards and a longer memory window for punishments. This is consistent with optimistic bias in human behavior.
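The asymmetric update described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the task, the choice rule, or any parameter values, so everything below is an assumption made for illustration: a two-armed bandit with periodic reversals stands in for a volatile environment, outcomes are coded as +1 and -1, choices follow a softmax rule, and the names `update_value`, `simulate`, `alpha_reward`, `alpha_punish`, `block_len`, `p_good`, and `beta` are hypothetical. Setting the two learning rates equal recovers the classic single-rate model.

```python
import math
import random


def update_value(value, outcome, alpha_reward, alpha_punish):
    """Move `value` toward `outcome`, with a step size chosen by the error sign."""
    error = outcome - value
    # Better-than-expected outcomes use alpha_reward, worse ones alpha_punish.
    alpha = alpha_reward if error >= 0 else alpha_punish
    return value + alpha * error


def simulate(alpha_reward, alpha_punish, n_trials=1000, block_len=50,
             p_good=0.8, beta=5.0, seed=0):
    """Run a two-armed bandit whose better arm switches every `block_len` trials.

    Outcomes are +1 (reward) or -1 (punishment); all settings are illustrative
    assumptions. Returns the summed outcome over all trials.
    """
    rng = random.Random(seed)
    values = [0.0, 0.0]
    good_arm = 0
    total = 0.0
    for t in range(n_trials):
        if t > 0 and t % block_len == 0:
            good_arm = 1 - good_arm  # reversal makes the environment volatile
        # Softmax choice between the two arms (logistic form for two options).
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (values[1] - values[0])))
        choice = 1 if rng.random() < p_choose_1 else 0
        p_reward = p_good if choice == good_arm else 1.0 - p_good
        outcome = 1.0 if rng.random() < p_reward else -1.0
        values[choice] = update_value(values[choice], outcome,
                                      alpha_reward, alpha_punish)
        total += outcome
    return total
```

One way to use the sketch is to call `simulate` over a grid of `(alpha_reward, alpha_punish)` pairs and several seeds, then compare the average totals of asymmetric and symmetric settings. A larger learning rate weights recent outcomes more heavily, which corresponds to the shorter memory window mentioned in the abstract.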


Evaluation and Potential Improvements of a Deep Reinforcement Learning Model for Automated Stock Trading

SSRN Electronic Journal, 2021, doi:10.2139/ssrn.3786304

Author(s): Rainer Andreas Jager

Keyword(s): Reinforcement Learning, Learning Model, Stock Trading, Reinforcement Learning Model


A Reinforcement Learning Model for Robots as Teachers*

2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2018, doi:10.1109/roman.2018.8525563

Author(s): Sayanti Roy, Christopher Crick, Emily Kieson, Charles Abramson

Keyword(s): Reinforcement Learning, Learning Model, Reinforcement Learning Model


Modeling individual variation in visual search with reinforcement learning

2020, doi:10.31234/osf.io/suj28

Author(s): Ben Lonnqvist, Micha Elsner, Amelia R. Hunt, Alasdair D F Clarke

Keyword(s): Visual Acuity, Reinforcement Learning, Visual Search, Individual Variation, Learning Model, Participant Preferences, Search Experiment, Reinforcement Learning Model, Specific Learning

Experiments on the efficiency of human search sometimes reveal large differences between individual participants. We argue that reward-driven task-specific learning may account for some of this variation. In a computational reinforcement learning model of this process, a wide variety of strategies emerge, despite all simulated participants having the same visual acuity. We conduct a visual search experiment, and replicate previous findings that participant preferences about where to search are highly varied, with a distribution comparable to the simulated results. Thus, task-specific learning is an under-explored mechanism by which large inter-participant differences can arise.
