Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- 1 Scaling Up Machine Learning: Introduction
- Part One Frameworks for Scaling Up Machine Learning
- Part Two Supervised and Unsupervised Learning Algorithms
- Part Three Alternative Learning Settings
- Part Four Applications
- 18 Large-Scale Learning for Vision with GPUs
- 19 Large-Scale FPGA-Based Convolutional Networks
- 20 Mining Tree-Structured Data on Multicore Systems
- 21 Scalable Parallelization of Automatic Speech Recognition
- Subject Index
- References
20 - Mining Tree-Structured Data on Multicore Systems
from Part Four - Applications
Published online by Cambridge University Press: 05 February 2012
Summary
Mining frequent subtrees in a database of rooted and labeled trees is an important problem in many domains, ranging from phylogenetic analysis to biochemistry and from linguistic parsing to XML data analysis. In this work, we revisit this problem and develop an architecture-conscious solution targeting emerging multicore systems. Specifically, we identify a sequence of memory-related optimizations that significantly improve the spatial and temporal locality of a state-of-the-art sequential algorithm, alleviating the effects of memory latency. Additionally, these optimizations are shown to reduce the pressure on the front-side bus, an important consideration in the context of large-scale multicore architectures. We then demonstrate that these optimizations, although necessary, are not sufficient for efficient parallelization on multicores, primarily because of parametric and data-driven factors that make load balancing a significant challenge. To address this challenge, we present a methodology that adaptively and automatically modulates the type and granularity of the work being shared among different cores. The resulting algorithm achieves near-perfect parallel efficiency on up to 16 processors on challenging real-world applications. The optimizations we present have general-purpose utility, and a key outcome is the development of a general-purpose scheduling service for moldable task scheduling on emerging multicore systems.
The field of knowledge discovery is concerned with extracting actionable knowledge from data efficiently. Although most of the early work in this field focused on mining simple transactional datasets, recently there has been a significant shift toward analyzing data with complex structure such as trees and graphs.
- Type: Chapter
- Information: Scaling up Machine Learning: Parallel and Distributed Approaches, pp. 420-445. Publisher: Cambridge University Press. Print publication year: 2011