Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- 1 Scaling Up Machine Learning: Introduction
- Part One Frameworks for Scaling Up Machine Learning
- Part Two Supervised and Unsupervised Learning Algorithms
- 6 PSVM: Parallel Support Vector Machines with Incomplete Cholesky Factorization
- 7 Massive SVM Parallelization Using Hardware Accelerators
- 8 Large-Scale Learning to Rank Using Boosted Decision Trees
- 9 The Transform Regression Algorithm
- 10 Parallel Belief Propagation in Factor Graphs
- 11 Distributed Gibbs Sampling for Latent Variable Models
- 12 Large-Scale Spectral Clustering with MapReduce and MPI
- 13 Parallelizing Information-Theoretic Clustering Methods
- Part Three Alternative Learning Settings
- Part Four Applications
- Subject Index
- References
7 - Massive SVM Parallelization Using Hardware Accelerators
from Part Two - Supervised and Unsupervised Learning Algorithms
Published online by Cambridge University Press: 05 February 2012
Summary
Support Vector Machines (SVMs) are among the most widely used classification and regression algorithms for data analysis, pattern recognition, and cognitive tasks. Yet the learning problems that SVMs can solve are limited in size by their high computational cost and large storage requirements. Many variants of the original SVM algorithm have been introduced that scale better to large problems, but they change the SVM framework quite drastically, for example by optimizing criteria other than the maximum margin or by introducing different error metrics into the cost function. Such algorithms may work for some applications, but they lack the robustness and universality that make SVMs so popular.
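For reference, the formulation kept intact here is the standard soft-margin SVM; its textbook dual quadratic program (standard notation, not reproduced from the chapter) makes the cost visible, since the quadratic term couples all n training points through an n-by-n kernel matrix, which is the source of the computation and storage bottleneck mentioned above:

```latex
% Standard soft-margin SVM dual (textbook form, not taken from the chapter):
% training pairs (x_i, y_i) with y_i \in \{-1,+1\}, kernel K, penalty C.
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j) \\
\text{subject to}\quad & 0 \le \alpha_i \le C, \quad i = 1,\dots,n, \\
& \sum_{i=1}^{n} \alpha_i y_i = 0 .
\end{aligned}
```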
The approach taken here is to keep the SVM algorithm in its original form and scale it to large problems through parallelization. Computer performance can no longer be improved at the pace of the past few decades by raising clock frequencies; today, significant accelerations are achieved mostly through parallel architectures, and multicore processors are commonplace. Mapping the SVM algorithm onto multicore processors with shared-memory architectures is straightforward, yet this approach does not scale to a large number of processors. Here we investigate parallelization concepts that scale to hundreds and thousands of cores, where, for example, cache coherence can no longer be maintained.
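As a rough illustration of why the shared-memory mapping is straightforward, the sketch below spreads the dominant cost, evaluating rows of a dense RBF kernel matrix, across cores with an OpenMP parallel loop. It is a minimal, hypothetical example (the toy data, the helper `sq_dist`, and all names are ours, not the chapter's implementation):

```cpp
// Minimal shared-memory sketch (not the chapter's code): compute one row of
// a dense RBF kernel matrix per loop iteration and let OpenMP spread the
// iterations over the available cores. Compile with e.g. g++ -O2 -fopenmp.
#include <cmath>
#include <cstdio>
#include <vector>

// Hypothetical helper: squared Euclidean distance between two dense samples.
static double sq_dist(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

int main() {
    const std::size_t n = 1000, dim = 16;
    const double gamma = 0.5;                          // RBF width parameter
    std::vector<std::vector<double>> x(n, std::vector<double>(dim, 0.0));
    for (std::size_t i = 0; i < n; ++i)                // synthetic toy data
        for (std::size_t d = 0; d < dim; ++d)
            x[i][d] = std::sin(0.01 * i + d);

    std::vector<double> K(n * n, 0.0);                 // dense kernel matrix

    // Each row K(i, :) is independent of the others, so the outer loop
    // parallelizes trivially on a shared-memory machine.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < static_cast<long>(n); ++i)
        for (std::size_t j = 0; j < n; ++j)
            K[i * n + j] = std::exp(-gamma * sq_dist(x[i], x[j]));

    std::printf("K[0][0] = %f, K[0][n-1] = %f\n", K[0], K[n - 1]);
    return 0;
}
```

Each row is independent, so this scales well over a handful of cores sharing one memory; the same dense matrix is also what stops the approach from scaling to thousands of cores, where the data must be partitioned and cache coherence gives way to explicit communication.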
A number of SVM implementations on clusters or graphics processors (GPUs) have been proposed recently. A parallel optimization algorithm based on gradient projections has been demonstrated (see Zanghirati and Zanni, 2003; Zanni, Serafini, and Zanghirati, 2006) that uses a spectral gradient method for fast convergence while maintaining the Karush-Kuhn-Tucker (KKT) constraints.
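To make the cited idea concrete in standard notation (this is the generic projected-gradient scheme, not the specific spectral variant of the cited work): writing the dual as minimization of f(alpha) = (1/2) alpha^T Q alpha - 1^T alpha with Q_ij = y_i y_j K(x_i, x_j), each iterate is projected back onto the feasible set defined by the box and equality constraints, so those constraints hold at every iteration; the spectral gradient method concerns the choice of the step length eta_k.

```latex
% Generic gradient-projection iteration on the SVM dual (standard scheme;
% the spectral gradient method supplies the step length \eta_k).
\alpha^{(k+1)} = P_{\Omega}\!\left(\alpha^{(k)} - \eta_k \, \nabla f\bigl(\alpha^{(k)}\bigr)\right),
\qquad
\Omega = \left\{ \alpha \in \mathbb{R}^{n} : 0 \le \alpha_i \le C,\;
\sum_{i=1}^{n} \alpha_i y_i = 0 \right\}.
```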
Type: Chapter
Book: Scaling Up Machine Learning: Parallel and Distributed Approaches, pp. 127-147
Publisher: Cambridge University Press
Print publication year: 2011