Independently Expiring Multiarmed Bandits

Rhonda Righter; J. George Shanthikumar

doi:10.1017/S0269964800005325

Published online by Cambridge University Press: 27 July 2009

Rhonda Righter: Department of Operations and Management Information Systems, Santa Clara University, Santa Clara, California 95053
J. George Shanthikumar: Department of Industrial Engineering and Operations Research and Walter A. Haas School of Business, University of California, Berkeley, Berkeley, California 94720

Abstract

We give conditions on the optimality of an index policy for multiarmed bandits when arms expire independently. We also give a new simple proof of the optimality of the Gittins index policy for the classic multiarmed bandit problem.

Type: Research Article
Information: Probability in the Engineering and Informational Sciences, Volume 12, Issue 4, October 1998, pp. 453–468

DOI: https://doi.org/10.1017/S0269964800005325
Copyright: Copyright © Cambridge University Press 1998
