6 - Kernel methods for cluster analysis

from Part III - Unsupervised learning models for cluster analysis

Published online by Cambridge University Press:  05 July 2014

S. Y. Kung
Affiliation:
Princeton University, New Jersey

Summary

Introduction

The raw data encountered in real-world applications fall into two main categories: vectorial and nonvectorial. For vectorial data, the Euclidean distance or inner product is often used as the similarity measure of the training vectors {x_i, i = 1, …, N}. This leads to the conventional K-means or SOM clustering methods. This chapter extends these methods to kernel-based cluster discovery and then to nonvectorial clustering applications, such as sequence analysis (e.g. protein sequences and signal motifs) and graph partition problems (e.g. molecular interactions, social networks). The fundamental unsupervised learning theory will be systematically extended to nonvectorial data analysis.
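The link between the two similarity measures named above can be made concrete with a minimal sketch. It assumes NumPy and a small made-up data set; the function name `squared_distances_from_gram` is hypothetical and not taken from the chapter. It shows only that pairwise squared Euclidean distances are recoverable from inner products alone, via ||x_i - x_j||^2 = <x_i, x_i> - 2<x_i, x_j> + <x_j, x_j>.

```python
import numpy as np

def squared_distances_from_gram(G):
    """Pairwise squared Euclidean distances from a Gram (inner-product) matrix.

    Uses ||x_i - x_j||^2 = <x_i, x_i> - 2 <x_i, x_j> + <x_j, x_j>.
    """
    diag = np.diag(G)  # <x_i, x_i> for every i
    return diag[:, None] - 2.0 * G + diag[None, :]

# Illustrative data (made up for this sketch): N = 4 vectors in 2-D.
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0],
              [3.0, 4.0]])

G = X @ X.T                          # Gram matrix of inner products
D2 = squared_distances_from_gram(G)  # distances without touching X again
```

Because the distances depend on the data only through the inner products, any method driven by Euclidean distance, such as K-means, can in principle be restated in terms of the Gram matrix.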

This chapter will cover the following kernel-based unsupervised learning models for cluster discovery.

  • Section 6.2 explores kernel K-means in intrinsic space. In this basic kernel K-means learning model, the original vectors are first mapped to the basis functions for the intrinsic vector space H, and the mapped vectors are then partitioned into clusters by conventional K-means. Because the intrinsic-space approach is not implementable for some vectorial and all nonvectorial applications, alternative representations need to be pursued. According to Theorem 1.1, the LSP condition holds for K-means. According to Eq. (1.20), this means that the problem formulation may be fully and uniquely characterized by the kernel matrix K associated with the training vectors, i.e. without a specific vector space being explicitly defined. In short, the original vector-based clustering criterion is converted to a vector-free clustering criterion.
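The vector-free idea can be illustrated with a minimal sketch of a Lloyd-style kernel K-means that reads only the kernel matrix K. This is not the chapter's algorithm as published, which is not reproduced on this page. The function names `rbf_kernel_matrix` and `kernel_kmeans`, the RBF kernel, the value of `gamma`, the toy data, and the initial labels are all assumptions made for illustration. The sketch relies on the standard identity ||φ(x_i) - μ_k||^2 = K_ii - (2/|C_k|) Σ_{j∈C_k} K_ij + (1/|C_k|^2) Σ_{j,l∈C_k} K_jl, where μ_k is the mean of the mapped vectors in cluster C_k.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix; one possible kernel choice (an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] - 2.0 * (X @ X.T) + sq[None, :]
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, n_clusters, init_labels, max_iter=100):
    """Lloyd-style kernel K-means that uses only the N x N kernel matrix K."""
    N = K.shape[0]
    labels = np.asarray(init_labels).copy()
    k_diag = np.diag(K)
    for _ in range(max_iter):
        dist = np.full((N, n_clusters), np.inf)
        for k in range(n_clusters):
            members = np.flatnonzero(labels == k)
            if members.size == 0:
                continue  # an empty cluster keeps infinite distance
            # ||phi(x_i) - mu_k||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl
            cross = K[:, members].mean(axis=1)
            within = K[np.ix_(members, members)].mean()
            dist[:, k] = k_diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

# Toy data (made up): two tight groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
K = rbf_kernel_matrix(X, gamma=0.5)
labels = kernel_kmeans(K, n_clusters=2, init_labels=[0, 1, 0, 1, 0, 1])
```

Note that `kernel_kmeans` never sees `X`. Any symmetric positive semidefinite similarity matrix, for instance one computed from sequences or graph nodes, could be passed in its place, which is the sense in which the criterion becomes vector-free.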

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2014
