On transforming an index for generalised bandit problems
Nash (1980) demonstrated that index policies are optimal for a class of generalised bandit problems. A transform of the index concerned has many of the attributes of the Gittins index. The transformed index is positive-valued, with maximal values yielding optimal actions. It may be characterised as the value of a restart problem and is hence computable via dynamic programming methods. The transformed index can also be used in procedures for policy evaluation.
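To illustrate what a restart characterisation looks like in practice, the following is a minimal sketch for the classical Gittins index of a single finite-state arm, using the well-known restart-in-state formulation of Katehakis and Veinott. It is not the transformed index of the generalised bandit problem discussed above, whose precise definition is not given here. The function name `gittins_index_restart`, the transition matrix `P`, the reward vector `r`, and the discount factor `beta` are all assumptions made for illustration.

```python
def gittins_index_restart(P, r, beta, i, tol=1e-10, max_iter=100_000):
    """Classical Gittins index of state i via the restart-in-i problem.

    Illustrative assumptions (not taken from the abstract):
      P    -- list of rows, P[j][k] = transition probability j -> k
      r    -- list of one-step rewards, r[j] earned when the arm is in j
      beta -- discount factor, 0 < beta < 1
    In every state j the controller either continues from j or restarts
    the arm in state i. The index is (1 - beta) times the optimal value
    of this problem evaluated at state i.
    """
    n = len(r)
    V = [0.0] * n
    for _ in range(max_iter):
        # Value of acting as if the arm were reset to state i.
        restart = r[i] + beta * sum(P[i][k] * V[k] for k in range(n))
        new_V = []
        for j in range(n):
            # Value of continuing from the current state j.
            cont = r[j] + beta * sum(P[j][k] * V[k] for k in range(n))
            new_V.append(max(cont, restart))
        # Stop once successive value-iteration sweeps agree to within tol.
        converged = max(abs(a - b) for a, b in zip(new_V, V)) < tol
        V = new_V
        if converged:
            break
    return (1.0 - beta) * V[i]


# Hypothetical two-state arm: state 0 pays 1 and then moves to the
# absorbing state 1, which pays 0 forever.
P_example = [[0.0, 1.0], [0.0, 1.0]]
r_example = [1.0, 0.0]
index_state_0 = gittins_index_restart(P_example, r_example, 0.9, 0)
index_state_1 = gittins_index_restart(P_example, r_example, 0.9, 1)
```

In this toy arm the best stopping rule from state 0 is to play once and stop, so its index is the one-step reward, while the absorbing state earns nothing and has index zero. Plain value iteration is used here for brevity; policy iteration or linear programming would solve the same restart problem.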
2008, Vol 40 (2), pp. 377-400