
5 - Uniformly Fine-Grained Data-Parallel Computing for Machine Learning Algorithms

from Part One - Frameworks for Scaling Up Machine Learning

Published online by Cambridge University Press:  05 February 2012

Meichun Hsu, HP Labs, Palo Alto, CA, USA
Ren Wu, HP Labs, Palo Alto, CA, USA
Bin Zhang, HP Labs, Palo Alto, CA, USA
Ron Bekkerman, LinkedIn Corporation, Mountain View, California
Mikhail Bilenko, Microsoft Research, Redmond, Washington
John Langford, Yahoo! Research, New York

Summary

The graphics processing unit (GPU) of modern computers has evolved into a powerful, general-purpose, massively parallel numerical (co-)processor. The numerical computation in many machine learning algorithms maps well onto the GPU. To help identify such algorithms, we present uniformly fine-grained data-parallel computing and illustrate it with two machine learning algorithms, clustering and regression clustering, on a mixed computing architecture that combines a GPU with a central processing unit (CPU). We discuss the key issues in successfully designing the algorithms, data structures, and the partitioning of computation between the CPU and the GPU. Performance on the mixed CPU/GPU architecture is compared with that of the regression clustering algorithm implemented entirely on a CPU, and significant speedups are reported. The mixed GPU/CPU architecture also achieves better cost-performance and energy-performance ratios.
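To make the idea of uniformly fine-grained data parallelism concrete, the sketch below shows a k-means assignment step in CUDA in which each GPU thread computes the nearest centroid for exactly one data point, while centroid recomputation is left to the CPU. This is an illustrative sketch, not the chapter's actual implementation; the kernel name, toy problem sizes, and launch configuration are assumptions made here for clarity.

// Minimal sketch: uniformly fine-grained k-means assignment step.
// One thread per data point; every thread does the same small amount of
// work (distance to each centroid), so the per-thread workload is uniform.
// The coarse-grained centroid update would run on the CPU.

#include <cfloat>
#include <cuda_runtime.h>

__global__ void assign_points(const float *points,    // n x dim, row-major
                              const float *centroids, // k x dim, row-major
                              int *labels,
                              int n, int k, int dim)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float best_dist = FLT_MAX;
    int best_c = 0;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int d = 0; d < dim; ++d) {
            float diff = points[i * dim + d] - centroids[c * dim + d];
            dist += diff * diff;
        }
        if (dist < best_dist) { best_dist = dist; best_c = c; }
    }
    labels[i] = best_c;  // nearest-centroid assignment for point i
}

int main()
{
    // Toy sizes chosen for illustration only.
    const int n = 1 << 16, k = 8, dim = 4;
    size_t pbytes = (size_t)n * dim * sizeof(float);
    size_t cbytes = (size_t)k * dim * sizeof(float);

    float *d_points, *d_centroids; int *d_labels;
    cudaMalloc((void **)&d_points, pbytes);
    cudaMalloc((void **)&d_centroids, cbytes);
    cudaMalloc((void **)&d_labels, n * sizeof(int));
    // A real run would copy data and centroids in, then alternate the GPU
    // assignment kernel with CPU-side centroid recomputation until convergence.

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    assign_points<<<blocks, threads>>>(d_points, d_centroids, d_labels, n, k, dim);
    cudaDeviceSynchronize();

    cudaFree(d_points); cudaFree(d_centroids); cudaFree(d_labels);
    return 0;
}

Because every thread performs the same amount of work, the GPU's execution units stay uniformly busy, and the irregular, serial part of the iteration (recomputing centroids from the assignments) is left to the CPU, which is the kind of CPU/GPU computation partitioning the chapter discusses.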

The computing power of the CPU has increased dramatically in the past few decades, driven both by miniaturization and by rising clock frequencies. As miniaturization continued, more and more electronic gates were packed onto the same area of a silicon die. Hardware-supported parallel computing, pipelining for example, further increased the computing power of CPUs. Frequency increases sped up CPUs even more directly. However, the long-predicted physical limit of this process was finally reached a few years ago: increasing the clock frequency is no longer feasible because of the accompanying nonlinear increase in power consumption, even though miniaturization itself continues.

Type: Chapter
In: Scaling Up Machine Learning: Parallel and Distributed Approaches, pp. 89-106
Publisher: Cambridge University Press
Print publication year: 2011


