Some indexable families of restless bandit problems

K. D. Glazebrook; D. Ruiz-Hernandez; C. Kirkbride

doi:10.1239/aap/1158684996

Part of: Numerical methods in calculus of variations and optimal control; Hamilton-Jacobi theories, including dynamic programming

Published online by Cambridge University Press: 01 July 2016

K. D. Glazebrook (∗), Lancaster University
D. Ruiz-Hernandez (∗∗), Universitat Pompeu Fabra
C. Kirkbride (∗∗∗), Lancaster University

∗ Postal address: Department of Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YF, UK. Email address: k.glazebrook@lancaster.ac.uk
∗∗ Postal address: Department of Economics and Business, Universitat Pompeu Fabra, E-08005 Barcelona, Spain.
∗∗∗ Postal address: Department of Management Science, Lancaster University, Lancaster, LA1 4YX, UK.

Abstract

In 1988 Whittle introduced an important but intractable class of restless bandit problems which generalise the multiarmed bandit problems of Gittins by allowing state evolution for passive projects. Whittle's account deployed a Lagrangian relaxation of the optimisation problem to develop an index heuristic. Despite a developing body of evidence (both theoretical and empirical) which underscores the strong performance of Whittle's index policy, a continuing challenge to implementation is the need to establish that the competing projects all pass an indexability test. In this paper we employ Gittins' index theory to establish the indexability of (inter alia) general families of restless bandits which arise in problems of machine maintenance and stochastic scheduling problems with switching penalties. We also give formulae for the resulting Whittle indices. Numerical investigations testify to the outstandingly strong performance of the index heuristics concerned.
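As an illustration of the index idea described above (not the formulae derived in the paper), the following minimal sketch computes a Whittle index numerically for a small discounted project. It assumes the project is indexable, so that the preference for the passive action is monotone in the subsidy and bisection is valid. All names here (`q_values`, `whittle_index`, the reward vectors and transition matrices) are hypothetical and chosen for the example only. The subsidy `W` is paid each period the project is passive, and the index of a state is taken as the subsidy at which the active and passive actions are equally attractive in that state.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def q_values(W: float, r_active: List[float], r_passive: List[float],
             P_active: Matrix, P_passive: Matrix, beta: float,
             tol: float = 1e-10) -> Tuple[List[float], List[float]]:
    """Value iteration for the single-project problem with passive subsidy W.

    Returns the (approximate) action values for the active and the
    passive action in every state.
    """
    n = len(r_active)
    V = [0.0] * n
    while True:
        Qa = [r_active[s] + beta * sum(P_active[s][t] * V[t] for t in range(n))
              for s in range(n)]
        Qp = [r_passive[s] + W + beta * sum(P_passive[s][t] * V[t] for t in range(n))
              for s in range(n)]
        V_new = [max(a, p) for a, p in zip(Qa, Qp)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return Qa, Qp
        V = V_new


def whittle_index(state: int, r_active: List[float], r_passive: List[float],
                  P_active: Matrix, P_passive: Matrix, beta: float,
                  lo: float = -100.0, hi: float = 100.0, iters: int = 60) -> float:
    """Bisection on the subsidy W (assumes indexability and lo < index < hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        Qa, Qp = q_values(mid, r_active, r_passive, P_active, P_passive, beta)
        if Qa[state] > Qp[state]:
            lo = mid   # subsidy too small: the active action is still preferred
        else:
            hi = mid   # subsidy large enough: the passive action is preferred
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative two-state project. State 0 earns 0 and moves to state 1
    # when active; state 1 earns 1 and stays put. The passive action freezes
    # the state and earns nothing, so this is a classical (non-restless) bandit.
    beta = 0.9
    r_active, r_passive = [0.0, 1.0], [0.0, 0.0]
    P_active = [[0.0, 1.0], [0.0, 1.0]]
    P_passive = [[1.0, 0.0], [0.0, 1.0]]
    for s in range(2):
        print(s, round(whittle_index(s, r_active, r_passive,
                                     P_active, P_passive, beta), 6))
```

In this frozen-passive example the computed values agree with the Gittins indices of the project (beta for state 0 and 1 for state 1), which is the special case that restless bandits generalise. Replacing `P_passive` with a non-identity matrix gives a restless project, for which the bisection is only meaningful once indexability has been established.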

Keywords

Bandit problem; dynamic programming; Gittins index; machine maintenance; restless bandit; stochastic scheduling; switching cost

MSC classification

Primary: 90C40: Markov and semi-Markov decision processes

Secondary: 49L20: Dynamic programming method; 90C39: Dynamic programming; 49M20: Methods of relaxation type

Type: General Applied Probability
Information: Advances in Applied Probability, Volume 38, Issue 3, September 2006, pp. 643–672

DOI: https://doi.org/10.1239/aap/1158684996
Copyright: © Applied Probability Trust 2006

References

Agrawal, R., Hegde, M. and Teneketzis, D. (1988). Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost. IEEE Trans. Automatic Control 33, 899–906.

Ansell, P. S., Glazebrook, K. D., Niño-Mora, J. and O'Keeffe, M. (2003). Whittle's index policy for a multi-class queueing system with convex holding costs. Math. Meth. Operat. Res. 57, 21–39.

Asawa, M. and Teneketzis, D. (1996). Multi-armed bandits with switching penalties. IEEE Trans. Automatic Control 41, 328–348.

Banks, J. S. and Sundaram, R. (1994). Switching costs and the Gittins index. Econometrica 62, 687–694.

Gittins, J. C. (1979). Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148–177.

Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, Chichester.

Glazebrook, K. D. (1980). On stochastic scheduling with precedence relations and switching costs. J. Appl. Prob. 17, 1016–1024.

Glazebrook, K. D., Mitchell, H. M. and Ansell, P. S. (2005). Index policies for the maintenance of a collection of machines by a set of repairmen. Europ. J. Operat. Res. 165, 267–284.

Glazebrook, K. D., Niño-Mora, J. and Ansell, P. S. (2002). Index policies for a class of discounted restless bandits. Adv. Appl. Prob. 34, 754–774.

Nash, P. (1979). Optimal allocation of resources between research projects. University of Cambridge.

Niño-Mora, J. (2001). Restless bandits, partial conservation laws and indexability. Adv. Appl. Prob. 33, 76–98.

Niño-Mora, J. (2002). Dynamic allocation indices for restless projects and queueing admission control: a polyhedral approach. Math. Program. 93, 361–413.

Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queuing network control. Math. Operat. Res. 24, 293–305.

Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York.

Reiman, M. I. and Wein, L. M. (1998). Dynamic scheduling of a two-class queue with setups. Operat. Res. 46, 532–547.

Van Oyen, M. P. and Teneketzis, D. (1994). Optimal stochastic scheduling of forest networks with switching penalties. Adv. Appl. Prob. 26, 474–479.

Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637–648. (Addendum: Adv. Appl. Prob. 23 (1991), 429–430.)

Whittle, P. (1980). Multi-armed bandits and the Gittins index. J. R. Statist. Soc. B 42, 143–149.

Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), ed. Gani, J., Applied Probability Trust, Sheffield, pp. 287–298.

Whittle, P. (1996). Optimal Control: Basics and Beyond. John Wiley, Chichester.
