Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- 1 Scaling Up Machine Learning: Introduction
- Part One Frameworks for Scaling Up Machine Learning
- Part Two Supervised and Unsupervised Learning Algorithms
- 6 PSVM: Parallel Support Vector Machines with Incomplete Cholesky Factorization
- 7 Massive SVM Parallelization Using Hardware Accelerators
- 8 Large-Scale Learning to Rank Using Boosted Decision Trees
- 9 The Transform Regression Algorithm
- 10 Parallel Belief Propagation in Factor Graphs
- 11 Distributed Gibbs Sampling for Latent Variable Models
- 12 Large-Scale Spectral Clustering with Map Reduce and MPI
- 13 Parallelizing Information-Theoretic Clustering Methods
- Part Three Alternative Learning Settings
- Part Four Applications
- Subject Index
- References
10 - Parallel Belief Propagation in Factor Graphs
from Part Two - Supervised and Unsupervised Learning Algorithms
Published online by Cambridge University Press: 05 February 2012
Summary
Probabilistic graphical models are used in a wide range of machine learning applications. From reasoning about protein interactions (Jaimovich et al., 2006) to stereo vision (Sun, Shum, and Zheng, 2002), graphical models have facilitated the application of probabilistic methods to challenging machine learning problems. A core operation in probabilistic graphical models is inference: the process of computing the probability of an event given particular observations. Although exact inference is NP-hard in general, there are several popular approximate inference algorithms that typically perform well in practice. Even these approximate algorithms remain computationally intensive, however, and can therefore benefit from parallelization. In this chapter, we parallelize loopy belief propagation (loopy BP for short), which is used in a wide range of ML applications (Jaimovich et al., 2006; Sun et al., 2002; Lan et al., 2006; Baron, Sarvotham, and Baraniuk, 2010; Singla and Domingos, 2008).
We begin by briefly reviewing the sequential BP algorithm as well as the necessary background in probabilistic graphical models. We then present a collection of parallel shared memory BP algorithms that demonstrate the importance of scheduling in parallel BP. Next, we develop the Splash BP algorithm, which combines new scheduling ideas to address the limitations of existing sequential BP algorithms and achieve theoretically optimal parallel performance. Finally, we show how to implement loopy BP algorithms efficiently in the distributed parallel setting by addressing the challenges of distributed state and load balancing.
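To make the inference task concrete, the following is a minimal sketch of sum-product loopy BP in Python. It is not the chapter's Splash BP or any of its parallel algorithms. It assumes a pairwise Markov random field rather than a general factor graph, strictly positive potentials, and a simple synchronous schedule in which every message is recomputed from the previous sweep. The names `loopy_bp` and `exact_marginals` and the example potentials are hypothetical, chosen only for illustration.

```python
from itertools import product


def loopy_bp(unary, pairwise, iters=50):
    """Synchronous sum-product BP on a pairwise model (illustrative sketch).

    unary:    {var: [phi(x) for each state x]}
    pairwise: {(i, j): table} with table[xi][xj], each undirected edge listed once
    Returns approximate marginals {var: [p(x) for each state x]}.
    """
    nbrs = {v: [] for v in unary}
    msgs = {}  # msgs[(i, j)] is the message from i to j, indexed by states of j
    for (i, j) in pairwise:
        nbrs[i].append(j)
        nbrs[j].append(i)
        msgs[(i, j)] = [1.0] * len(unary[j])
        msgs[(j, i)] = [1.0] * len(unary[i])

    def psi(i, j, xi, xj):
        # Look up the edge potential regardless of the stored orientation.
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(iters):
        new = {}
        # Each update reads only the previous sweep's messages, so the
        # updates within one sweep are independent of one another.
        for (i, j) in msgs:
            out = []
            for xj in range(len(unary[j])):
                total = 0.0
                for xi in range(len(unary[i])):
                    term = unary[i][xi] * psi(i, j, xi, xj)
                    for k in nbrs[i]:
                        if k != j:
                            term *= msgs[(k, i)][xi]
                    total += term
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]  # normalize for stability
        msgs = new

    beliefs = {}
    for v in unary:
        b = list(unary[v])
        for k in nbrs[v]:
            b = [b[x] * msgs[(k, v)][x] for x in range(len(b))]
        z = sum(b)
        beliefs[v] = [x / z for x in b]
    return beliefs


def exact_marginals(unary, pairwise):
    """Brute-force marginals by enumerating every joint assignment."""
    names = sorted(unary)
    marg = {v: [0.0] * len(unary[v]) for v in names}
    for assign in product(*[range(len(unary[v])) for v in names]):
        x = dict(zip(names, assign))
        w = 1.0
        for v in names:
            w *= unary[v][x[v]]
        for (i, j), table in pairwise.items():
            w *= table[x[i]][x[j]]
        for v in names:
            marg[v][x[v]] += w
    for v in names:
        z = sum(marg[v])
        marg[v] = [p / z for p in marg[v]]
    return marg


if __name__ == "__main__":
    # Hypothetical three-variable example with an attractive coupling.
    unary = {"a": [0.7, 0.3], "b": [0.5, 0.5], "c": [0.2, 0.8]}
    attract = [[2.0, 1.0], [1.0, 2.0]]
    chain = {("a", "b"): attract, ("b", "c"): attract}
    print(loopy_bp(unary, chain))
    print(exact_marginals(unary, chain))
```

On a tree-structured model such as the chain above, BP recovers the exact marginals, so the two printed results should agree up to floating-point error. On a graph with cycles the beliefs are only approximations, and the brute-force enumeration, whose cost grows exponentially with the number of variables, illustrates why approximate methods are used at scale. The synchronous sweep here is only one possible message schedule.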
- Type: Chapter
- Information: Scaling up Machine Learning: Parallel and Distributed Approaches, pp. 190-216
- Publisher: Cambridge University Press
- Print publication year: 2011