Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- 1 Scaling Up Machine Learning: Introduction
- Part One Frameworks for Scaling Up Machine Learning
- 2 MapReduce and Its Application to Massively Parallel Learning of Decision Tree Ensembles
- 3 Large-Scale Machine Learning Using DryadLINQ
- 4 IBM Parallel Machine Learning Toolbox
- 5 Uniformly Fine-Grained Data-Parallel Computing for Machine Learning Algorithms
- Part Two Supervised and Unsupervised Learning Algorithms
- Part Three Alternative Learning Settings
- Part Four Applications
- Subject Index
- References
5 - Uniformly Fine-Grained Data-Parallel Computing for Machine Learning Algorithms
from Part One - Frameworks for Scaling Up Machine Learning
Published online by Cambridge University Press: 05 February 2012
Summary
The graphics processing unit (GPU) of modern computers has evolved into a powerful, general-purpose, massively parallel numerical (co-)processor. The numerical computation in a number of machine learning algorithms fits well on the GPU. To help identify such algorithms, we present uniformly fine-grained data-parallel computing and illustrate it with two machine learning algorithms, clustering and regression clustering, on a mixed GPU and central processing unit (CPU) computing architecture. We discuss the key issues involved in a successful design of the algorithms, the data structures, and the partitioning of computation between a CPU and a GPU. Performance on the mixed CPU and GPU architecture is compared with that of the regression clustering algorithm implemented entirely on a CPU, and significant speedups are reported. The mixed GPU and CPU architecture also achieves better cost-performance and energy-performance ratios.
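As a rough illustration of the uniformly fine-grained data-parallel pattern the summary describes (a sketch, not the chapter's actual implementation), the assignment step of k-means clustering maps one logical thread to each data point: every point independently computes its distance to all cluster centers and picks the nearest. The Python/NumPy code below stands in for those per-point GPU threads via vectorized broadcasting; all function names are illustrative.

```python
import numpy as np

def assign_clusters(points, centers):
    """Fine-grained data-parallel k-means assignment step.

    Conceptually, one GPU thread per data point computes that
    point's squared distances to all centers and selects the
    nearest one; NumPy broadcasting stands in for the threads.
    """
    # (n, 1, d) - (1, k, d) -> (n, k, d): pairwise differences
    diffs = points[:, None, :] - centers[None, :, :]
    # Squared Euclidean distance from each point to each center
    sq_dists = np.einsum('nkd,nkd->nk', diffs, diffs)
    return sq_dists.argmin(axis=1)  # index of nearest center per point

def update_centers(points, labels, k):
    """Reduction step: each center becomes the mean of its assigned
    points (on a GPU this would be a parallel reduction)."""
    return np.stack([points[labels == j].mean(axis=0) for j in range(k)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated synthetic clusters around (0,0) and (5,5)
    pts = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                     rng.normal(5.0, 0.1, (50, 2))])
    centers = np.array([[0.0, 0.0], [5.0, 5.0]])
    labels = assign_clusters(pts, centers)
    centers = update_centers(pts, labels, 2)
```

Because each point's assignment is independent of every other point's, the work partitions uniformly across threads with no inter-thread communication, which is what makes this step a good fit for the GPU; the reduction in the update step is the part that needs more care in a CPU-GPU partitioning.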
The computing power of the CPU has increased dramatically in the past few decades, driven by both miniaturization and rising clock frequencies. As miniaturization continued, more and more electronic gates were packed onto the same area of a silicon die. Hardware-supported parallelism, pipelining for example, further increased the computing power of CPUs, and frequency increases sped up CPUs even more directly. However, the long-predicted physical limit of this process was finally reached a few years ago: although miniaturization still continues, increasing the clock frequency is no longer feasible because of the accompanying nonlinear growth in power consumption.
- Type: Chapter
- Information: Scaling Up Machine Learning: Parallel and Distributed Approaches, pp. 89-106. Publisher: Cambridge University Press. Print publication year: 2011