Value of structural health information in partially observable stochastic environments

2021, Vol 93, pp. 102072
Author(s): Charalampos P. Andriotis, Konstantinos G. Papakonstantinou, Eleni N. Chatzi
Author(s): Jan Leike, Tor Lattimore, Laurent Orseau, Marcus Hutter

We discuss some recent results on Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments. These environments can be non-Markovian, non-ergodic, and partially observable. We show that Thompson sampling learns the environment class in the sense that (1) asymptotically, its value converges in mean to the optimal value, and (2) given a recoverability assumption, its regret is sublinear. We conclude with a discussion of optimality in reinforcement learning.
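
As a rough illustration of the sampling loop the abstract refers to, the following Python sketch runs Thompson sampling over a small finite class of Bernoulli bandit environments. It is a deliberately simplified stand-in, not the authors' algorithm: the abstract concerns countable classes of general, history-dependent environments, whereas this sketch assumes stateless bandits, a uniform prior, and resampling at every step. The names `thompson_sampling`, `ENV_CLASS`, and `true_index` are hypothetical and chosen only for this example.

```python
import random


def thompson_sampling(env_class, true_index, steps, seed=0):
    """Thompson sampling over a finite class of Bernoulli bandits (illustrative only).

    env_class[i][a] is the assumed reward probability of arm a in environment i.
    true_index selects which environment actually generates the rewards.
    """
    rng = random.Random(seed)
    n_envs = len(env_class)
    posterior = [1.0 / n_envs] * n_envs  # assumption: uniform prior over the class
    total_reward = 0

    for _ in range(steps):
        # 1. Sample one environment from the current posterior.
        sampled = rng.choices(range(n_envs), weights=posterior)[0]

        # 2. Act optimally for the sampled environment (first best arm on ties).
        arm_probs = env_class[sampled]
        arm = max(range(len(arm_probs)), key=arm_probs.__getitem__)

        # 3. Observe a reward generated by the true environment.
        reward = 1 if rng.random() < env_class[true_index][arm] else 0
        total_reward += reward

        # 4. Bayesian update: reweight each environment by the likelihood
        #    it assigns to the observed reward, then renormalize.
        likelihoods = [p[arm] if reward else 1.0 - p[arm] for p in env_class]
        unnormalized = [w * l for w, l in zip(posterior, likelihoods)]
        z = sum(unnormalized)
        posterior = [u / z for u in unnormalized]

    return posterior, total_reward


# Hypothetical three-environment class with two arms each.
ENV_CLASS = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]
posterior, total_reward = thompson_sampling(ENV_CLASS, true_index=1, steps=300)
print(posterior, total_reward)
```

In this toy class every pull is informative about which environment is the true one, so the posterior concentrates quickly. The general setting in the abstract needs the recoverability assumption precisely because an agent in a non-ergodic environment may not be able to recover from early mistakes.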

