Index policies for a class of discounted restless bandits

  • K. D. Glazebrook, J. Niño-Mora and P. S. Ansell

Abstract

The paper concerns a class of discounted restless bandit problems which possess an indexability property. Conservation laws yield an expression for the reward suboptimality of a general policy. These results are utilised to study the closeness to optimality of an index policy for a special class of simple and natural dual speed restless bandits for which indexability is guaranteed. The strong performance of the index policy is confirmed by a computational study.

Corresponding author

∗∗ Postal address: Department of Economics and Business, Universitat Pompeu Fabra, E-08005, Barcelona, Spain.
∗∗∗ Postal address: School of Mathematics and Statistics, University of Newcastle upon Tyne, Newcastle upon Tyne NE1 7RU, UK.

Footnotes

Current address: School of Management, University of Edinburgh, William Robertson Building, 50 George Square, Edinburgh EH8 9JY, UK. Email address: kevin.glazebrook@ed.ac.uk
