Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- 1 Scaling Up Machine Learning: Introduction
- Part One Frameworks for Scaling Up Machine Learning
- Part Two Supervised and Unsupervised Learning Algorithms
- Part Three Alternative Learning Settings
- Part Four Applications
- 18 Large-Scale Learning for Vision with GPUs
- 19 Large-Scale FPGA-Based Convolutional Networks
- 20 Mining Tree-Structured Data on Multicore Systems
- 21 Scalable Parallelization of Automatic Speech Recognition
- Subject Index
- References
19 - Large-Scale FPGA-Based Convolutional Networks
from Part Four - Applications
Published online by Cambridge University Press: 05 February 2012
Summary
Micro-robots, unmanned aerial vehicles, imaging sensor networks, wireless phones, and other embedded vision systems all require low-cost, high-speed implementations of synthetic vision systems capable of recognizing and categorizing objects in a scene.
Many successful object recognition systems use dense features extracted on regularly spaced patches over the input image. The majority of the feature extraction systems have a common structure composed of a filter bank (generally based on oriented edge detectors or 2D Gabor functions), a nonlinear operation (quantization, winner-take-all, sparsification, normalization, and/or pointwise saturation), and finally a pooling operation (max, average, or histogramming). For example, the scale-invariant feature transform (SIFT) (Lowe, 2004) operator applies oriented edge filters to a small patch and determines the dominant orientation through a winner-take-all operation. Finally, the resulting sparse vectors are added (pooled) over a larger patch to form a local orientation histogram. Some recognition systems use a single stage of feature extractors (Lazebnik, Schmid, and Ponce, 2006; Dalal and Triggs, 2005; Berg, Berg, and Malik, 2005; Pinto, Cox, and DiCarlo, 2008).
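The filter bank, nonlinearity, and pooling structure described above can be sketched in a few lines of NumPy. This is only an illustrative single-stage extractor under stated assumptions, not SIFT itself and not the chapter's FPGA implementation: the function names (`oriented_edge_filters`, `filter_valid`, `extract_features`), the derivative-of-Gaussian filters, the absolute-value rectification, and the non-overlapping sum pooling are all choices made here for illustration.

```python
import numpy as np

def oriented_edge_filters(n_orient=4, size=5):
    """Hypothetical filter bank: oriented derivative-of-Gaussian edge detectors."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    gauss = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * (size / 4.0) ** 2))
    bank = []
    for k in range(n_orient):
        theta = np.pi * k / n_orient
        # Directional derivative of the Gaussian along angle theta.
        f = (xs * np.cos(theta) + ys * np.sin(theta)) * gauss
        bank.append(f / np.abs(f).sum())
    return np.stack(bank)  # shape: (n_orient, size, size)

def filter_valid(image, kernel):
    """'Valid' 2D cross-correlation of a grayscale image with one kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def extract_features(image, bank, pool=4):
    """One stage: filter bank -> winner-take-all nonlinearity -> sum pooling."""
    # 1. Filter bank: one response map per orientation.
    responses = np.stack([filter_valid(image, f) for f in bank])
    rectified = np.abs(responses)
    # 2. Nonlinearity: keep only the strongest orientation at each pixel.
    winners = rectified.argmax(axis=0)
    mask = np.arange(len(bank))[:, None, None] == winners[None]
    sparse = rectified * mask
    # 3. Pooling: sum over non-overlapping pool x pool blocks, which yields
    #    a local orientation histogram for each block.
    k, h, w = sparse.shape
    h2, w2 = h // pool, w // pool
    cropped = sparse[:, :h2 * pool, :w2 * pool]
    return cropped.reshape(k, h2, pool, w2, pool).sum(axis=(2, 4))
```

For a 20 x 20 grayscale image and the default 5 x 5 filters, `extract_features(image, oriented_edge_filters())` returns an array of shape `(4, 4, 4)`: one 4-bin orientation histogram for each 4 x 4 block of the 16 x 16 filtered image.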
Other models, such as HMAX-type models (Serre, Wolf, and Poggio, 2005; Mutch and Lowe, 2006) and convolutional networks, use two or more layers of successive feature extractors. Different training algorithms have been used to learn the parameters of convolutional networks. In LeCun et al. (1998b) and Huang and LeCun (2006), purely supervised learning is used to update the parameters. More recent work, however, has focused on training with an auxiliary task (Ahmed et al., 2008) or with unsupervised objectives (Ranzato et al., 2007b; Kavukcuoglu et al., 2009; Jarrett et al., 2009; Lee et al., 2009).
- Type: Chapter
- Information: Scaling up Machine Learning: Parallel and Distributed Approaches, pp. 399-419
- Publisher: Cambridge University Press
- Print publication year: 2011