Restless bandits, partial conservation laws and indexability

We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both this range and the optimal indices being determined by a one-pass adaptive-greedy algorithm that extends Klimov's: we call such scheduling problems PCL-indexable. We further apply the PCL framework to investigate the indexability property of restless bandits (two-action finite-state Markov decision chains) introduced by Whittle, obtaining the following results: (i) we present conditions on model parameters under which a single restless bandit is PCL-indexable, and hence indexable; membership of the class of PCL-indexable bandits is tested through a single run of the adaptive-greedy algorithm, which further computes the Whittle indices when the test is positive; this provides a tractable sufficient condition for indexability; (ii) we further introduce the subclass of GCL-indexable bandits (including classical bandits), which are indexable under arbitrary linear rewards. Our analysis is based on the achievable region approach to stochastic optimization, as the results follow from deriving and exploiting a new linear programming reformulation for single restless bandits.


Some indexable families of restless bandit problems

Advances in Applied Probability ◽ 10.1239/aap/1158684996 ◽ 2006 ◽ Vol 38 (3) ◽ pp. 643-672 ◽ Cited By ~ 32

Author(s): K. D. Glazebrook ◽ D. Ruiz-Hernandez ◽ C. Kirkbride

Keyword(s): Index Theory ◽ Stochastic Scheduling ◽ Gittins Index ◽ Scheduling Problems ◽ Bandit Problems ◽ Index Policy ◽ Restless Bandit ◽ Machine Maintenance ◽ State Evolution ◽ Strong Performance

In 1988 Whittle introduced an important but intractable class of restless bandit problems which generalise the multiarmed bandit problems of Gittins by allowing state evolution for passive projects. Whittle's account deployed a Lagrangian relaxation of the optimisation problem to develop an index heuristic. Despite a developing body of evidence (both theoretical and empirical) which underscores the strong performance of Whittle's index policy, a continuing challenge to implementation is the need to establish that the competing projects all pass an indexability test. In this paper we employ Gittins' index theory to establish the indexability of (inter alia) general families of restless bandits which arise in problems of machine maintenance and stochastic scheduling problems with switching penalties. We also give formulae for the resulting Whittle indices. Numerical investigations testify to the outstandingly strong performance of the index heuristics concerned.
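The indexability test mentioned in this abstract can be illustrated numerically. The following is a minimal sketch, not the method of the paper: it assumes a single discounted, finite-state project, adds a subsidy to the passive reward, solves the resulting two-action problem by value iteration, and checks on a finite grid of subsidies that the set of states where passivity is optimal only grows. All names (`passive_set`, `passive_sets_nested`, the matrices `P0`, `P1`, the rewards `r0`, `r1`, and the grid) are hypothetical, and a grid check can only suggest indexability, never prove it.

```python
from typing import List, Sequence, Set

Matrix = Sequence[Sequence[float]]


def passive_set(P0: Matrix, P1: Matrix, r0: Sequence[float], r1: Sequence[float],
                subsidy: float, beta: float = 0.9,
                tol: float = 1e-10, max_iter: int = 100_000) -> Set[int]:
    """States in which the passive action is optimal when passivity earns `subsidy`.

    P0/r0 describe the passive action, P1/r1 the active one (assumed inputs).
    """
    n = len(r0)

    def q_values(v: List[float]):
        # One-step lookahead values of the passive (q0) and active (q1) actions.
        q0 = [r0[i] + subsidy + beta * sum(P0[i][j] * v[j] for j in range(n))
              for i in range(n)]
        q1 = [r1[i] + beta * sum(P1[i][j] * v[j] for j in range(n))
              for i in range(n)]
        return q0, q1

    v = [0.0] * n
    for _ in range(max_iter):
        q0, q1 = q_values(v)
        new_v = [max(a, b) for a, b in zip(q0, q1)]
        gap = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if gap < tol:
            break

    q0, q1 = q_values(v)
    # Ties count as passive; the small slack absorbs value-iteration error.
    return {i for i in range(n) if q0[i] >= q1[i] - 1e-8}


def passive_sets_nested(P0: Matrix, P1: Matrix, r0: Sequence[float],
                        r1: Sequence[float], subsidies: Sequence[float],
                        beta: float = 0.9) -> bool:
    """True if the passive set never loses a state as the subsidy increases."""
    previous: Set[int] = set()
    for lam in sorted(subsidies):
        current = passive_set(P0, P1, r0, r1, lam, beta)
        if not previous <= current:
            return False
        previous = current
    return True


# Illustrative two-state project whose passive action freezes the state,
# i.e. a classical (Gittins-type) bandit; the numbers are made up.
P0 = [[1.0, 0.0], [0.0, 1.0]]
P1 = [[0.5, 0.5], [0.5, 0.5]]
r0 = [0.0, 0.0]
r1 = [1.0, 0.2]
GRID = [-10.0, 0.0, 0.3, 0.8, 10.0]

if __name__ == "__main__":
    for lam in GRID:
        print(lam, sorted(passive_set(P0, P1, r0, r1, lam)))
    print("nested on grid:", passive_sets_nested(P0, P1, r0, r1, GRID))
```

The check covers only the monotonicity part of Whittle's definition on the chosen grid. A finer grid, or a bisection on the subsidy for each state, would be needed to approximate the Whittle index values themselves.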


Index policies for a class of discounted restless bandits

Advances in Applied Probability ◽ 10.1017/s0001867800011903 ◽ 2002 ◽ Vol 34 (04) ◽ pp. 754-774 ◽ Cited By ~ 8

Author(s): K. D. Glazebrook ◽ J. Niño-Mora ◽ P. S. Ansell

Keyword(s): Conservation Laws ◽ Special Class ◽ Computational Study ◽ Bandit Problems ◽ Index Policy ◽ Restless Bandit ◽ Restless Bandits ◽ Index Policies ◽ Strong Performance ◽ Dual Speed

The paper concerns a class of discounted restless bandit problems which possess an indexability property. Conservation laws yield an expression for the reward suboptimality of a general policy. These results are utilised to study the closeness to optimality of an index policy for a special class of simple and natural dual speed restless bandits for which indexability is guaranteed. The strong performance of the index policy is confirmed by a computational study.
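For readers unfamiliar with how an index policy acts once indices are available, here is a minimal sketch. It assumes each project has a precomputed table mapping its states to index values and that exactly m projects may be active at each decision epoch. The function name `index_policy` and the example tables are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def index_policy(states: Sequence[int],
                 index_tables: Sequence[Sequence[float]],
                 m: int) -> List[int]:
    """Return the projects to activate: the m with the largest current index.

    states[k] is the current state of project k and index_tables[k][s] is the
    (assumed precomputed) index of project k in state s. Ties are broken in
    favour of the lower project number.
    """
    current = [index_tables[k][s] for k, s in enumerate(states)]
    ranked = sorted(range(len(states)), key=lambda k: (-current[k], k))
    return sorted(ranked[:m])


# Made-up index tables for three two-state projects.
TABLES = [[1.0, 0.56], [0.9, 0.1], [0.7, 0.3]]

if __name__ == "__main__":
    print(index_policy([1, 0, 0], TABLES, m=2))
```

The policy is myopic in the indices: it looks only at the current state of each project, which is what makes it cheap to apply once the index tables are known.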
