Distributed Recommender Profiling and Selection with Gittins Indices

Author(s):  
Li-tung Weng ◽  
Yue Xu ◽  
Yuefeng Li ◽  
Richi Nayak

2007 ◽  
Vol 44 (2) ◽  
pp. 554-559 ◽  
Author(s):  
Roger Filliger ◽  
Max-Olivier Hongler

We explicitly calculate the dynamic allocation indices (i.e. the Gittins indices) for multi-armed bandit processes driven by superdiffusive noise sources. This class of models generalizes earlier results derived by Karatzas for diffusive processes. In particular, in this soluble class of superdiffusive models the Gittins indices explicitly depend on the noise state.



1991 ◽  
Vol 23 (4) ◽  
pp. 975-977
Author(s):  
You-Gan Wang ◽  
John Gittins

The Bernoulli/exponential target process is considered. Such processes have been found useful in modelling the search for active compounds in pharmaceutical research. An inequality is presented which improves a result of Gittins (1989), thus providing a better approximation to the Gittins indices which define the optimal search policy.
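The optimal search policy referred to above is defined by Gittins indices of Bayesian arms. As a rough illustration (a minimal sketch, not the Bernoulli/exponential target process itself), the following code approximates the Gittins index of a single Bernoulli arm with a Beta(a, b) posterior via the standard retirement formulation: value iteration over a truncated horizon, plus bisection on the per-step retirement reward. The discount factor `beta` and the truncation `horizon` are illustrative choices.

```python
import numpy as np

def gittins_bernoulli(a, b, beta=0.9, horizon=100, tol=1e-6):
    """Approximate Gittins index of a Bernoulli arm with Beta(a, b) posterior.

    Retirement formulation: the index is the per-step retirement reward at
    which the agent is indifferent between retiring forever and pulling the
    arm at least once more.  The dynamic programme is truncated after
    `horizon` further observations, so the result is an approximation.
    """
    def prefers_continuing(lam):
        retire = lam / (1.0 - beta)            # value of retiring forever
        # Value at the truncation depth: assume the agent retires there.
        V = np.full(horizon + 1, retire)
        for depth in range(horizon - 1, -1, -1):
            newV = np.empty(depth + 1)
            for s in range(depth + 1):         # s successes in `depth` pulls
                p = (a + s) / (a + b + depth)  # posterior mean success prob.
                cont = p * (1.0 + beta * V[s + 1]) + (1.0 - p) * beta * V[s]
                newV[s] = max(retire, cont)
            V = newV
        return V[0] > retire + 1e-12

    lo, hi = 0.0, 1.0                          # the index lies in [0, 1] here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prefers_continuing(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a uniform Beta(1, 1) prior and beta = 0.9 this gives an index above the posterior mean 0.5, reflecting the value of exploration; improved closed-form approximations of the kind discussed in the abstract avoid this dynamic-programming computation entirely.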



2020 ◽  
Author(s):  
Daniel Russo

This note gives a short, self-contained proof of a sharp connection between Gittins indices and Bayesian upper confidence bound algorithms. I consider a Gaussian multiarmed bandit problem with discount factor γ. The Gittins index of an arm is shown to equal the γ-quantile of the posterior distribution of the arm's mean plus an error term that vanishes as γ → 1. In this sense, for sufficiently patient agents, a Gittins index measures the highest plausible mean reward of an arm in a manner equivalent to an upper confidence bound.
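Taking γ for the discount factor (the listing above lost the original formulas, so the symbol is an assumption), the quantile rule described in this note reduces to a one-line Bayes-UCB index for a Gaussian posterior. A minimal sketch:

```python
from statistics import NormalDist

def bayes_ucb_index(mu, sigma, gamma):
    """gamma-quantile of a N(mu, sigma^2) posterior over an arm's mean.

    Per the note above, for patient agents (gamma near 1) this quantile
    approximates the arm's Gittins index: the highest plausible mean
    reward of the arm at confidence level gamma.
    """
    return mu + sigma * NormalDist().inv_cdf(gamma)
```

An agent would pull the arm maximizing this index; as γ → 1 the exploration bonus sigma · Φ⁻¹(γ) grows, favouring arms whose means are still uncertain.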



Biometrika ◽  
1991 ◽  
Vol 78 (1) ◽  
pp. 101-111 ◽  
Author(s):  
You-Gan Wang


1988 ◽  
Vol 20 (2) ◽  
pp. 447-472 ◽  
Author(s):  
Tze Leung Lai ◽  
Zhiliang Ying

Asymptotic approximations are developed herein for the optimal policies in discounted multi-armed bandit problems in which new projects are continually appearing, commonly known as ‘open bandit problems’ or ‘arm-acquiring bandits’. It is shown that under certain stability assumptions the open bandit problem is asymptotically equivalent to a closed bandit problem in which there is no arrival of new projects, as the discount factor approaches 1. Applications of these results to optimal scheduling of queueing networks are given. In particular, Klimov's priority indices for scheduling queueing networks are shown to be limits of the Gittins indices for the associated closed bandit problem, and extensions of Klimov's results to preemptive policies and to unstable queueing systems are given.




2001 ◽  
Vol 33 (2) ◽  
pp. 365-390 ◽  
Author(s):  
R. T. Dunn ◽  
K. D. Glazebrook

We consider generalisations of two classical stochastic scheduling models, namely the discounted branching bandit and the discounted multi-armed bandit, to the case where the collection of machines available for processing is itself a stochastic process. Under rather mild conditions on the machine availability process we obtain performance guarantees for a range of controls based on Gittins indices. Various forms of asymptotic optimality are established for index-based limit policies as the discount rate approaches 0.





DGOR ◽  
1986 ◽  
pp. 548-548
Author(s):  
Lodewijk C. M. Kallenberg


