Book contents
- Frontmatter
- Contents
- Preface
- Notation
- 1 The Learning Methodology
- 2 Linear Learning Machines
- 3 Kernel-Induced Feature Spaces
- 4 Generalisation Theory
- 5 Optimisation Theory
- 6 Support Vector Machines
- 7 Implementation Techniques
- 8 Applications of Support Vector Machines
- A Pseudocode for the SMO Algorithm
- B Background Mathematics
- References
- Index
6 - Support Vector Machines
Published online by Cambridge University Press: 05 March 2013
Summary
The material covered in the first five chapters has given us the foundation on which to introduce Support Vector Machines, the learning approach originally developed by Vapnik and co-workers. Support Vector Machines are a system for efficiently training the linear learning machines introduced in Chapter 2 in the kernel-induced feature spaces described in Chapter 3, while respecting the insights provided by the generalisation theory of Chapter 4 and exploiting the optimisation theory of Chapter 5. An important feature of these systems is that, while enforcing the learning biases suggested by the generalisation theory, they also produce ‘sparse’ dual representations of the hypothesis, resulting in extremely efficient algorithms. This sparseness is due to the Karush–Kuhn–Tucker conditions, which hold for the solution and play a crucial role in the practical implementation and analysis of these machines. Another important feature of the Support Vector approach is that, thanks to Mercer's conditions on the kernels, the corresponding optimisation problems are convex and hence have no local minima. This fact, together with the reduced number of non-zero parameters, marks a clear distinction between these systems and other pattern recognition algorithms, such as neural networks. This chapter will also describe the optimisation required to implement the Bayesian learning strategy using Gaussian processes.
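The sparseness of the dual representation and the convexity of the problem can both be seen on a toy example. The sketch below is an illustration only, not the book's algorithm: the data, the general-purpose SLSQP solver, and the zero-tolerance are all invented for the demonstration. It solves the hard-margin dual for a linear kernel and checks that most of the Lagrange multipliers vanish, so the hypothesis depends only on a few support vectors:

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data: two clusters in 2-D (invented for illustration).
X = np.array([[2.0, 2.0], [2.5, 1.5], [3.0, 3.0],
              [-2.0, -1.0], [-1.5, -2.5], [-3.0, -2.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Gram matrix for the linear kernel K(x, z) = <x, z>.
K = X @ X.T
Q = (y[:, None] * y[None, :]) * K

# Dual objective, negated so a minimiser can be used:
#   W(alpha) = sum_i alpha_i - 1/2 sum_{i,j} alpha_i alpha_j y_i y_j K(x_i, x_j)
def neg_dual(alpha):
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

cons = {"type": "eq", "fun": lambda a: a @ y}   # sum_i alpha_i y_i = 0
bounds = [(0, None)] * len(y)                   # alpha_i >= 0 (hard margin)
res = minimize(neg_dual, np.zeros(len(y)), bounds=bounds, constraints=cons)
alpha = res.x

# Sparseness: by the KKT conditions, only support vectors (points on the
# margin) receive non-zero multipliers; the tolerance 1e-4 is arbitrary.
sv = alpha > 1e-4
print("support vector indices:", np.flatnonzero(sv))

# Recover the hyperplane: w = sum_i alpha_i y_i x_i, b from any support vector.
w = (alpha * y) @ X
b = y[sv][0] - X[sv][0] @ w
print("all training points classified correctly:",
      np.all(np.sign(X @ w + b) == y))
```

Because the dual is a convex quadratic programme, any local solver that converges has found the global solution; on this data only two of the six multipliers are non-zero.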
Support Vector Classification
The aim of Support Vector classification is to devise a computationally efficient way of learning ‘good’ separating hyperplanes in a high dimensional feature space, where by ‘good’ hyperplanes we understand ones optimising the generalisation bounds described in Chapter 4, and by ‘computationally efficient’ we mean algorithms able to deal with sample sizes of the order of 100,000 instances.
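In the separable case, the ‘good’ hyperplane is the maximal margin one. Its standard formulation as a convex programme (written here with ℓ for the sample size, following the book's notation) is:

```latex
\begin{aligned}
\min_{w,\,b}\quad & \tfrac{1}{2}\,\langle w, w \rangle \\
\text{subject to}\quad & y_i\bigl(\langle w, x_i \rangle + b\bigr) \ge 1,
  \qquad i = 1, \dots, \ell ,
\end{aligned}
```

whose solution realises a geometric margin of $1/\|w\|_2$; the generalisation bounds of Chapter 4 improve as this margin grows.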
- Publisher: Cambridge University Press
- Print publication year: 2000