Hostname: page-component-7c8c6479df-94d59 Total loading time: 0 Render date: 2024-03-28T08:35:41.144Z Has data issue: false hasContentIssue false

Index policies for a class of discounted restless bandits

Published online by Cambridge University Press:  01 July 2016

K. D. Glazebrook
Affiliation:
University of Newcastle upon Tyne
J. Niño-Mora*
Affiliation:
Universitat Pompeu Fabra
P. S. Ansell*
Affiliation:
University of Newcastle upon Tyne
*
∗∗ Postal address: Department of Economics and Business, Universitat Pompeu Fabra, E-08005, Barcelona, Spain.
∗∗∗ Postal address: School of Mathematics and Statistics, University of Newcastle upon Tyne, Newcastle upon Tyne NE1 7RU, UK.

Abstract

The paper concerns a class of discounted restless bandit problems which possess an indexability property. Conservation laws yield an expression for the reward suboptimality of a general policy. These results are utilised to study the closeness to optimality of an index policy for a special class of simple and natural dual speed restless bandits for which indexability is guaranteed. The strong performance of the index policy is confirmed by a computational study.

Type
General Applied Probability
Copyright
Copyright © Applied Probability Trust 2002 

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Footnotes

Current address: School of Management, University of Edinburgh, William Robertson Building, 50 George Square, Edinburgh EH8 9JY, UK. Email address: kevin.glazebrook@ed.ac.uk

References

Bertsimas, D. and Niño-Mora, J. (1996). Conservation laws, extended polymatroids and multi-armed bandit problems: a polyhedral approach to indexable systems. Math. Operat. Res. 21, 257306.CrossRefGoogle Scholar
Faihe, Y. and Müller, J.-P. (1998). Behaviors coordination using restless bandit allocation indices. In From Animals to Animats 5 (Proc. 5th Internat. Conf. Simulation of Adaptive Behavior, Zürich), eds Pfeifer, R. et al., MIT Press, Cambridge, MA.Google Scholar
Gittins, J. C. (1979). Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148177.Google Scholar
Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, New York.Google Scholar
Glazebrook, K. D. and Garbe, R. (1999). Almost optimal policies for stochastic systems which almost satisfy conservation laws. Ann. Operat. Res. 92, 1943.CrossRefGoogle Scholar
Glazebrook, K. D. and Niño-Mora, J. (2001). Parallel scheduling of multiclass M/M/m queues: approximate and heavy-traffic optimization of achievable performance. To appear in Operat. Res. 49, 609623.Google Scholar
Glazebrook, K. D. and Wilkinson, D. J. (2000). Index-based policies for discounted multi-armed bandits on parallel machines. Ann. Appl. Prob. 10, 877896.CrossRefGoogle Scholar
Glazebrook, K. D., Niño-Mora, J. and Ansell, P. S. (2000). Index policies for a class of discounted restless bandits. Tech. Rep., University of Newcastle upon Tyne.Google Scholar
Niño-Mora, J. (1999). Restless bandits, partial conservation laws and indexability. Working paper 435, Department of Economics and Business, Universitat Pompeu Fabra, Barcelona.Google Scholar
Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queueing network control. Math. Operat. Res. 24, 293305.Google Scholar
Varaiya, P. P., Walrand, J. C. and Buyukkoc, C. (1985). Extensions of the multi-armed bandit problem: the discounted case. IEEE Trans. Automatic Control 30, 426439.Google Scholar
Veatch, M. and Wein, L. M. (1996). Scheduling a make-to-stock queue: index policies and hedging points. Operat. Res. 44, 634647.CrossRefGoogle Scholar
Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637648.Google Scholar
Weber, R. R. and Weiss, G. (1991). Addendum to ‘On an index policy for restless bandits’. Adv. Appl. Prob. 23, 429430.Google Scholar
Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), ed. Gani, J., Applied Probability Trust, Sheffield, pp. 287298.Google Scholar