
11 - Distributed Gibbs Sampling for Latent Variable Models

from Part Two - Supervised and Unsupervised Learning Algorithms

Published online by Cambridge University Press: 05 February 2012

Arthur Asuncion (University of California)
Padhraic Smyth (University of California)
Max Welling (University of California)
David Newman (University of California)
Ian Porteous (Google Inc., Kirkland, WA, USA)
Scott Triglia (University of California)

Edited by:
Ron Bekkerman (LinkedIn Corporation, Mountain View, California)
Mikhail Bilenko (Microsoft Research, Redmond, Washington)
John Langford (Yahoo! Research, New York)

Summary

In this chapter, we address distributed learning algorithms for statistical latent variable models, with a focus on topic models. Many high-dimensional datasets, such as text corpora and image databases, are too large for topic models to be learned on a single computer. Moreover, a growing number of applications require inference to be fast or even real-time, motivating the exploration of parallel and distributed learning algorithms.

We begin by reviewing topic models such as Latent Dirichlet Allocation and Hierarchical Dirichlet Processes. We discuss parallel and distributed algorithms for learning these models and show that these algorithms can achieve substantial speedups without sacrificing model quality. Next we discuss practical guidelines for running our algorithms within various parallel computing frameworks and highlight complementary speedup techniques. Finally, we generalize our distributed approach to handle Bayesian networks.
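
As a concrete illustration of the distributed approach, the sketch below simulates, in a single process, an approximate distributed Gibbs sampling scheme for LDA (AD-LDA): documents are partitioned across P workers, each worker runs a collapsed Gibbs sweep against its own stale copy of the topic-word counts, and the copies are then merged back into the global counts. The toy corpus, hyperparameter values, and all variable names are illustrative assumptions, not code from the chapter.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy corpus: each "document" is a list of word ids from a small vocabulary.
    V, K, P = 20, 4, 2                 # vocabulary size, topics, workers
    ALPHA, ETA = 0.1, 0.01             # symmetric Dirichlet hyperparameters
    docs = [rng.integers(V, size=rng.integers(10, 30)).tolist() for _ in range(8)]

    # Random initial topic assignment for every token, plus global count matrices.
    z = [[int(rng.integers(K)) for _ in d] for d in docs]
    nwk = np.zeros((V, K))             # topic-word counts (shared across workers)
    njk = np.zeros((len(docs), K))     # document-topic counts (local to one worker)
    for j, d in enumerate(docs):
        for i, w in enumerate(d):
            nwk[w, z[j][i]] += 1
            njk[j, z[j][i]] += 1

    def local_sweep(doc_ids, nwk_local):
        # One collapsed Gibbs sweep over one worker's documents, using a stale
        # local copy of the topic-word counts -- the AD-LDA approximation.
        nk_local = nwk_local.sum(axis=0)
        for j in doc_ids:
            for i, w in enumerate(docs[j]):
                k = z[j][i]            # remove the token's current assignment
                nwk_local[w, k] -= 1; nk_local[k] -= 1; njk[j, k] -= 1
                p = (nwk_local[w] + ETA) / (nk_local + V * ETA) * (njk[j] + ALPHA)
                k = int(rng.choice(K, p=p / p.sum()))   # sample a new topic
                z[j][i] = k
                nwk_local[w, k] += 1; nk_local[k] += 1; njk[j, k] += 1

    for sweep in range(50):
        parts = np.array_split(np.arange(len(docs)), P)  # static document partition
        copies = [nwk.copy() for _ in range(P)]
        for p_id in range(P):          # these sweeps would run in parallel
            local_sweep(parts[p_id], copies[p_id])
        # Synchronize: fold each worker's local increments into the global
        # matrix, i.e. nwk <- nwk + sum_p (nwk_p - nwk).
        nwk = nwk + sum(c - nwk for c in copies)

Because workers own disjoint sets of documents, the document-topic counts njk never conflict; only the shared topic-word counts are stale between synchronizations, which is why the scheme can achieve speedups without sacrificing model quality.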

Several of the results in this chapter have appeared in previous papers in the specific context of topic modeling. The goal of this chapter is to present a comprehensive overview of distributed inference algorithms and to extend the general ideas to a broader class of Bayesian networks.

Latent Variable Models

Latent variable models are a class of statistical models that explain observed data through latent (or hidden) variables. Topic models and hidden Markov models are two examples; their latent variables are the topic assignment variables and the hidden states, respectively. Given observed data, the goal is to perform Bayesian inference over the latent variables and to use the learned model for subsequent analysis and prediction.
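
To make the latent variables concrete, the sketch below generates a toy corpus from LDA's generative process: theta (per-document topic mixtures), phi (per-topic word distributions), and the per-token assignments z are all latent; only the words are observed. The corpus sizes and Dirichlet hyperparameters are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    D, V, K, N = 5, 50, 3, 40      # documents, vocabulary size, topics, tokens/doc
    alpha, eta = 0.1, 0.01         # symmetric Dirichlet hyperparameters

    phi = rng.dirichlet([eta] * V, size=K)      # latent: word distribution per topic
    theta = rng.dirichlet([alpha] * K, size=D)  # latent: topic mixture per document

    corpus, assignments = [], []
    for d in range(D):
        z = rng.choice(K, size=N, p=theta[d])          # latent: topic of each token
        w = [int(rng.choice(V, p=phi[k])) for k in z]  # observed: the words
        corpus.append(w)
        assignments.append(z)

    # Inference runs this process in reverse: given only `corpus`, a Gibbs
    # sampler draws from the posterior over z (and hence over theta and phi).

Collapsed Gibbs sampling integrates theta and phi out analytically and resamples only the assignments z, which reduces the state that distributed samplers such as the one sketched above must keep synchronized to simple count matrices.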

Type: Chapter
In: Scaling up Machine Learning: Parallel and Distributed Approaches, pp. 217–239
Publisher: Cambridge University Press
Print publication year: 2011


