Restless bandits, partial conservation laws and indexability

José Niño-Mora

doi:10.1017/S0001867800010648

Restless bandits, partial conservation laws and indexability

Part of: Operations research and management science

Published online by Cambridge University Press: 01 July 2016

José Niño-Mora

Show author details

José Niño-Mora*: Affiliation:
Universitat Pompeu Fabra, Barcelona
*: ∗ Postal address: Department of Economics and Business, Universitat Pompeu Fabra, 08005 Barcelona, Spain. Email address: jose.nino-mora@econ.upf.es

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both this range and the optimal indices being determined by a one-pass adaptive-greedy algorithm that extends Klimov's: we call such scheduling problems PCL-indexable. We further apply the PCL framework to investigate the indexability property of restless bandits (two-action finite-state Markov decision chains) introduced by Whittle, obtaining the following results: (i) we present conditions on model parameters under which a single restless bandit is PCL-indexable, and hence indexable; membership of the class of PCL-indexable bandits is tested through a single run of the adaptive-greedy algorithm, which further computes the Whittle indices when the test is positive; this provides a tractable sufficient condition for indexability; (ii) we further introduce the subclass of GCL-indexable bandits (including classical bandits), which are indexable under arbitrary linear rewards. Our analysis is based on the achievable region approach to stochastic optimization, as the results follow from deriving and exploiting a new linear programming reformulation for single restless bandits.

Keywords

stochastic scheduling index policies Markov decision chains bandit problems achievable region

MSC classification

Primary: 90B35: Scheduling theory, deterministic

Secondary: 90C40: Markov and semi-Markov decision processes

Type: General Applied Probability
Information: Advances in Applied Probability , Volume 33 , Issue 1 , March 2001 , pp. 76 - 98

DOI: https://doi.org/10.1017/S0001867800010648 [Opens in a new window]
Copyright: Copyright © Applied Probability Trust 2001

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

[1] Bertsimas, D. and Niño-Mora, J. (1996). Conservation laws, extended polymatroids and multiarmed bandit problems; a unified approach to indexable systems. Math. Operat. Res. 21, 257–306.CrossRef Google Scholar

[2] Bertsimas, D. and Niño-Mora, J. (1999). Optimization of multiclass queueing networks with changeover times via the achievable region method: Part I, the single-station case. Math. Operat. Res. 24, 306–330.CrossRef Google Scholar

[3] Bertsimas, D. and Niño-Mora, J. (2000). Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Operat. Res. 48, 80–90.Google Scholar

[4] Chen, Y. R. and Katehakis, M. N. (1986). Linear programming for finite state multi-armed bandit problems. Math. Operat. Res. 11, 180–183.CrossRef Google Scholar

[5] Dacre, M., Glazebrook, K. D. and Niño-Mora, J. (1999). The achievable region approach to the optimal control of stochastic systems (with discussion). J. R. Statist. Soc. B 61, 747–791.CrossRef Google Scholar

[6] D'Epénoux, F., (1960). Sur un problème de production et de stockage dans l'aléatoire. RAIRO Rech. Opérat. 14, 3–16. English translation: A probabilistic production and inventory problem. Management Sci. 10 (1963), 98–108.Google Scholar

[7] Faihe, Y. and Müller, J.-P. (1998). Behaviors coordination using restless bandits allocation indexes. In From Animals to Animats 5 (Proc. 5th Int. Conf. Simulation of Adaptive Behavior), eds Pfeifer, R., Blumberg, B., Meyer, J. A. and Wilson, S. W.. MIT Press, Cambridge, MA.Google Scholar

[8] Gittins, J. C. (1979). Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148–177.Google Scholar

[9] Katehakis, M. N. and Veinott, A. F. Jr (1987). The multi-armed bandit problem: decomposition and computation. Math. Operat. Res. 12, 262–268.CrossRef Google Scholar

[10] Klimov, G. P. (1974). Time sharing service systems I. Theory Prob. Appl. 19, 532–551.CrossRef Google Scholar

[11] Manne, A. S. (1960). Linear programming and sequential decisions. Management Sci. 6, 259–267.CrossRef Google Scholar

[12] Niño-Mora, J. (2000). Admission control to birth–death queueing systems, restless bandit indices, and partial conservation laws. Working paper, Department of Economics and Business, Universitat Pompeu Fabra.Google Scholar

[13] Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queueing network control. Math. Operat. Res. 24, 293–305.Google Scholar

[14] Veatch, M. and Wein, L. M. (1996). Scheduling a make-to-stock queue: index policies and hedging points. Operat. Res. 44, 634–647.Google Scholar

[15] Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637–648.CrossRef Google Scholar

[16] Weber, R. R. and Weiss, G. (1991). Addendum to ‘On an index policy for restless bandits’. Adv. Appl. Prob. 23, 429–430.CrossRef Google Scholar

[17] Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. 25A), ed. Gani, J.. Applied Probability Trust, Sheffield, pp. 287–298.Google Scholar

Article contents

Restless bandits, partial conservation laws and indexability

Abstract

Keywords

MSC classification

Access options

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests