
Twenty Questions with Noise: Bayes Optimal Policies for Entropy Loss

  • Bruno Jedynak (a1), Peter I. Frazier (a2) and Raphael Sznitman (a1)

Abstract

We consider the problem of twenty questions with noisy answers, in which we seek to find a target by repeatedly choosing a set, asking an oracle whether the target lies in this set, and obtaining an answer corrupted by noise. Starting with a prior distribution on the target's location, we seek to minimize the expected entropy of the posterior distribution. We formulate this problem as a dynamic program and show that any policy optimizing the one-step expected reduction in entropy is also optimal over the full horizon. Two such Bayes optimal policies are presented: one generalizes the probabilistic bisection policy due to Horstein and the other asks a deterministic set of questions. We study the structural properties of the latter, and illustrate its use in a computer vision application.
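The first of these policies is easy to sketch in code. The following is a minimal, illustrative discretization built on assumptions of our own choosing (a uniform prior on [0, 1], a binary symmetric noise channel with known error probability eps, and a grid-based posterior); the function name and parameters are ours, not the paper's. Each question asks whether the target lies to the left of the current posterior median, so the posterior mass is split as evenly as the grid allows, and the noisy answer is folded in by Bayes' rule.

import numpy as np

def probabilistic_bisection(target, n_questions=20, eps=0.1, grid_size=1000, rng=None):
    """Illustrative probabilistic-bisection sketch with noisy yes/no answers.

    The target lies in [0, 1]; the posterior is kept on a discrete grid.
    Each question asks whether the target lies left of the posterior
    median; the oracle's answer is flipped with probability eps, and the
    posterior is updated by Bayes' rule. (The discretization, noise model,
    and parameter names are illustrative assumptions, not the paper's.)
    """
    rng = np.random.default_rng() if rng is None else rng
    grid = np.linspace(0.0, 1.0, grid_size)
    posterior = np.full(grid_size, 1.0 / grid_size)  # uniform prior

    for _ in range(n_questions):
        # Question set A_n: grid points up to the posterior median, so that
        # the posterior mass of A_n is as close to 1/2 as the grid allows.
        cdf = np.cumsum(posterior)
        median = grid[np.searchsorted(cdf, 0.5)]
        in_set = grid <= median

        # Noisy oracle: the true answer is reported incorrectly with probability eps.
        truth = bool(target <= median)
        answer = truth if rng.random() > eps else not truth

        # Bayes update for a binary symmetric channel: points consistent
        # with the answer get weight 1 - eps, the others get weight eps.
        likelihood = np.where(in_set == answer, 1.0 - eps, eps)
        posterior = posterior * likelihood
        posterior /= posterior.sum()

    return float(grid @ posterior), posterior  # posterior mean and full posterior

In this sketch, when a question splits the posterior mass exactly in half, the expected one-step reduction in entropy equals 1 - H(eps) bits, the capacity of the binary symmetric channel (H denoting the binary entropy function); maximizing this one-step reduction is the greedy criterion that, per the abstract, is also optimal over the full horizon.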



Corresponding author

∗ Postal address: Department of Applied Mathematics and Statistics, Johns Hopkins University, Whitehead 208-B, 3400 North Charles Street, Baltimore, MD 21218, USA. Email address: bruno.jedynak@jhu.edu
∗∗ Postal address: School of Operations Research and Industrial Engineering, Cornell University, 232 Rhodes Hall, Ithaca, NY 14850, USA.
∗∗∗ Postal address: Johns Hopkins University, Hackerman Hall, 3400 North Charles Street, Baltimore, MD 21218, USA.

Footnotes


Research supported in part by AFOSR YIP FA9550-11-1-0083.

Research supported in part by NIH grant R01 EB 007969-01.


References

[1] Ben-Or, M. and Hassidim, A. (2008). The Bayesian learner is optimal for noisy binary search (and pretty good for quantum as well). In 2008 49th Ann. IEEE Symp. Foundations of Computer Science. IEEE Computer Society Press, Washington, DC, pp. 221–230.
[2] Berry, D. A. and Fristedt, B. (1985). Bandit Problems. Chapman & Hall, London.
[3] Blum, J. R. (1954). Multidimensional stochastic approximation methods. Ann. Math. Statist. 25, 737–744.
[4] Burnašhev, M. V. and Zigangirov, K. Š. (1974). A certain problem of interval estimation in observation control. Problemy Peredachi Informatsii 10, 51–61.
[5] Castro, R. and Nowak, R. (2008). Active learning and sampling. In Foundations and Applications of Sensor Management, Springer, pp. 177–200.
[6] Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory. John Wiley, New York.
[7] DeGroot, M. H. (1970). Optimal Statistical Decisions. McGraw Hill, New York.
[8] Dynkin, E. B. and Yushkevich, A. A. (1979). Controlled Markov Processes. Springer, New York.
[9] Frazier, P. I., Powell, W. B. and Dayanik, S. (2008). A knowledge-gradient policy for sequential information collection. SIAM J. Control Optimization 47, 2410–2439.
[10] Geman, D. and Jedynak, B. (1996). An active testing model for tracking roads in satellite images. IEEE Trans. Pattern Anal. Machine Intelligence 18, 1–14.
[11] Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, Chichester.
[12] Horstein, M. (1963). Sequential decoding using noiseless feedback. IEEE Trans. Inf. Theory 9, 136–143.
[13] Horstein, M. (2002). Sequential transmission using noiseless feedback. IEEE Trans. Inf. Theory 9, 136–143.
[14] Karp, R. M. and Kleinberg, R. (2007). Noisy binary search and its applications. In Proc. 18th Ann. ACM-SIAM Symp. Discrete Algorithms, ACM, New York, pp. 881–890.
[15] Kushner, H. J. and Yin, G. G. (2003). Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Springer, New York.
[16] Lai, T. L. and Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6, 4–22.
[17] Lampert, C. H., Blaschko, M. B. and Hofmann, T. (2009). Efficient subwindow search: a branch and bound framework for object localization. IEEE Trans. Pattern Anal. Machine Intelligence 31, 2129–2142.
[18] Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. Internat. J. Comput. Vision 60, 91–110.
[19] Nowak, R. (2008). Generalized binary search. In 2008 46th Ann. Allerton Conf. Commun., Control, and Computing, pp. 568–574.
[20] Nowak, R. (2009). Noisy generalized binary search. Adv. Neural Inf. Processing Systems 22, 1366–1374.
[21] Pelc, A. (2002). Searching games with errors—fifty years of coping with liars. Theoret. Comput. Sci. 270, 71–109.
[22] Polyak, B. T. (1990). A new method of stochastic approximation type. Automat. Remote Control 51, 937–946.
[23] Robbins, H. (1952). Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. 58, 527–535.
[24] Robbins, H. and Monro, S. (1951). A stochastic approximation method. Ann. Math. Statist. 22, 400–407.
[25] Ruppert, D. (1988). Efficient estimators from a slowly convergent Robbins-Monro procedure. Tech. Rep. 781, School of Operations Research and Industrial Engineering, Cornell University.
[26] Schapire, R. E. (1990). The strength of weak learnability. Machine Learning 5, 197–227.
[27] Sznitman, R. and Jedynak, B. (2010). Active testing for face detection and localization. IEEE Trans. Pattern Anal. Machine Intelligence 32, 1914–1920.
[28] Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer, New York.
[29] Vedaldi, A., Gulshan, V., Varma, M. and Zisserman, A. (2009). Multiple kernels for object detection. In Proc. Internat. Conf. Computer Vision, pp. 606–613.
[30] Viola, P. and Jones, M. J. (2004). Robust real-time face detection. Internat. J. Comput. Vision 57, 137154.
[31] Waeber, R., Frazier, P. I. and Henderson, S. G. (2011). A Bayesian approach to stochastic root finding. In Proc. 2011 Winter Simulation Conference, eds Jain, S. et al., IEEE.
[32] Whittle, P. (1981). Arm-acquiring bandits. Ann. Prob. 9, 284292.
[33] Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), ed. Gani, J., Applied Probability Trust, Sheffield, pp. 287298.
