Skip to main content Accessibility help
×
Home

Restless bandits, partial conservation laws and indexability

  • José Niño-Mora (a1)

Abstract

We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both this range and the optimal indices being determined by a one-pass adaptive-greedy algorithm that extends Klimov's: we call such scheduling problems PCL-indexable. We further apply the PCL framework to investigate the indexability property of restless bandits (two-action finite-state Markov decision chains) introduced by Whittle, obtaining the following results: (i) we present conditions on model parameters under which a single restless bandit is PCL-indexable, and hence indexable; membership of the class of PCL-indexable bandits is tested through a single run of the adaptive-greedy algorithm, which further computes the Whittle indices when the test is positive; this provides a tractable sufficient condition for indexability; (ii) we further introduce the subclass of GCL-indexable bandits (including classical bandits), which are indexable under arbitrary linear rewards. Our analysis is based on the achievable region approach to stochastic optimization, as the results follow from deriving and exploiting a new linear programming reformulation for single restless bandits.

Copyright

Corresponding author

Postal address: Department of Economics and Business, Universitat Pompeu Fabra, 08005 Barcelona, Spain. Email address: jose.nino-mora@econ.upf.es

References

Hide All
[1] Bertsimas, D. and Niño-Mora, J. (1996). Conservation laws, extended polymatroids and multiarmed bandit problems; a unified approach to indexable systems. Math. Operat. Res. 21, 257306.
[2] Bertsimas, D. and Niño-Mora, J. (1999). Optimization of multiclass queueing networks with changeover times via the achievable region method: Part I, the single-station case. Math. Operat. Res. 24, 306330.
[3] Bertsimas, D. and Niño-Mora, J. (2000). Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Operat. Res. 48, 8090.
[4] Chen, Y. R. and Katehakis, M. N. (1986). Linear programming for finite state multi-armed bandit problems. Math. Operat. Res. 11, 180183.
[5] Dacre, M., Glazebrook, K. D. and Niño-Mora, J. (1999). The achievable region approach to the optimal control of stochastic systems (with discussion). J. R. Statist. Soc. B 61, 747791.
[6] D'Epénoux, F., (1960). Sur un problème de production et de stockage dans l'aléatoire. RAIRO Rech. Opérat. 14, 3–16. English translation: A probabilistic production and inventory problem. Management Sci. 10 (1963), 98108.
[7] Faihe, Y. and Müller, J.-P. (1998). Behaviors coordination using restless bandits allocation indexes. In From Animals to Animats 5 (Proc. 5th Int. Conf. Simulation of Adaptive Behavior), eds Pfeifer, R., Blumberg, B., Meyer, J. A. and Wilson, S. W.. MIT Press, Cambridge, MA.
[8] Gittins, J. C. (1979). Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148177.
[9] Katehakis, M. N. and Veinott, A. F. Jr (1987). The multi-armed bandit problem: decomposition and computation. Math. Operat. Res. 12, 262268.
[10] Klimov, G. P. (1974). Time sharing service systems I. Theory Prob. Appl. 19, 532551.
[11] Manne, A. S. (1960). Linear programming and sequential decisions. Management Sci. 6, 259267.
[12] Niño-Mora, J. (2000). Admission control to birth–death queueing systems, restless bandit indices, and partial conservation laws. Working paper, Department of Economics and Business, Universitat Pompeu Fabra.
[13] Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queueing network control. Math. Operat. Res. 24, 293305.
[14] Veatch, M. and Wein, L. M. (1996). Scheduling a make-to-stock queue: index policies and hedging points. Operat. Res. 44, 634647.
[15] Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637648.
[16] Weber, R. R. and Weiss, G. (1991). Addendum to ‘On an index policy for restless bandits’. Adv. Appl. Prob. 23, 429430.
[17] Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. 25A), ed. Gani, J.. Applied Probability Trust, Sheffield, pp. 287298.

Keywords

MSC classification

Related content

Powered by UNSILO

Restless bandits, partial conservation laws and indexability

  • José Niño-Mora (a1)

Metrics

Full text views

Total number of HTML views: 0
Total number of PDF views: 0 *
Loading metrics...

Abstract views

Total abstract views: 0 *
Loading metrics...

* Views captured on Cambridge Core between <date>. This data will be updated every 24 hours.

Usage data cannot currently be displayed.