Book contents
- Frontmatter
- Contents
- Preface
- Notation
- 1 The Learning Methodology
- 2 Linear Learning Machines
- 3 Kernel-Induced Feature Spaces
- 4 Generalisation Theory
- 5 Optimisation Theory
- 6 Support Vector Machines
- 7 Implementation Techniques
- 8 Applications of Support Vector Machines
- A Pseudocode for the SMO Algorithm
- B Background Mathematics
- References
- Index
6 - Support Vector Machines
Published online by Cambridge University Press: 05 March 2013
Summary
The material covered in the first five chapters has given us the foundation on which to introduce Support Vector Machines, the learning approach originally developed by Vapnik and co-workers. Support Vector Machines are a system for efficiently training the linear learning machines introduced in Chapter 2 in the kernel-induced feature spaces described in Chapter 3, while respecting the insights provided by the generalisation theory of Chapter 4 and exploiting the optimisation theory of Chapter 5. An important feature of these systems is that, while enforcing the learning biases suggested by the generalisation theory, they also produce ‘sparse’ dual representations of the hypothesis, resulting in extremely efficient algorithms. This sparseness is due to the Karush–Kuhn–Tucker conditions, which hold for the solution and play a crucial role in the practical implementation and analysis of these machines. Another important feature of the Support Vector approach is that, thanks to Mercer's conditions on the kernels, the corresponding optimisation problems are convex and hence have no local minima. This fact, together with the reduced number of non-zero parameters, marks a clear distinction between these systems and other pattern recognition algorithms, such as neural networks. This chapter will also describe the optimisation required to implement the Bayesian learning strategy using Gaussian processes.
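The sparseness of the dual representation and the convexity of the problem can both be seen on a toy example. The sketch below is an illustration only, not the book's algorithm: the data, the general-purpose SLSQP solver, and the zero-tolerance are all invented for the demonstration. It solves the hard-margin dual for a linear kernel and checks that most of the Lagrange multipliers vanish, so the hypothesis depends only on a few support vectors:

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data: two clusters in 2-D (invented for illustration).
X = np.array([[2.0, 2.0], [2.5, 1.5], [3.0, 3.0],
              [-2.0, -1.0], [-1.5, -2.5], [-3.0, -2.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Gram matrix for the linear kernel K(x, z) = <x, z>.
K = X @ X.T
Q = (y[:, None] * y[None, :]) * K

# Dual objective, negated so a minimiser can be used:
#   W(alpha) = sum_i alpha_i - 1/2 sum_{i,j} alpha_i alpha_j y_i y_j K(x_i, x_j)
def neg_dual(alpha):
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

cons = {"type": "eq", "fun": lambda a: a @ y}   # sum_i alpha_i y_i = 0
bounds = [(0, None)] * len(y)                   # alpha_i >= 0 (hard margin)
res = minimize(neg_dual, np.zeros(len(y)), bounds=bounds, constraints=cons)
alpha = res.x

# Sparseness: by the KKT conditions, only support vectors (points on the
# margin) receive non-zero multipliers; the tolerance 1e-4 is arbitrary.
sv = alpha > 1e-4
print("support vector indices:", np.flatnonzero(sv))

# Recover the hyperplane: w = sum_i alpha_i y_i x_i, b from any support vector.
w = (alpha * y) @ X
b = y[sv][0] - X[sv][0] @ w
print("all training points classified correctly:",
      np.all(np.sign(X @ w + b) == y))
```

Because the dual is a convex quadratic programme, any local solver that converges has found the global solution; on this data only two of the six multipliers are non-zero.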
Support Vector Classification
The aim of Support Vector classification is to devise a computationally efficient way of learning ‘good’ separating hyperplanes in a high dimensional feature space, where by ‘good’ hyperplanes we understand ones optimising the generalisation bounds described in Chapter 4, and by ‘computationally efficient’ we mean algorithms able to deal with sample sizes of the order of 100,000 instances.
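In the separable case, the ‘good’ hyperplane is the maximal margin one. Its standard formulation as a convex programme (written here with ℓ for the sample size, following the book's notation) is:

```latex
\begin{aligned}
\min_{w,\,b}\quad & \tfrac{1}{2}\,\langle w, w \rangle \\
\text{subject to}\quad & y_i\bigl(\langle w, x_i \rangle + b\bigr) \ge 1,
  \qquad i = 1, \dots, \ell ,
\end{aligned}
```

whose solution realises a geometric margin of $1/\|w\|_2$; the generalisation bounds of Chapter 4 improve as this margin grows.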
- Publisher: Cambridge University Press
- Print publication year: 2000