
7 - Cluster Analysis

Published online by Cambridge University Press:  26 April 2019

Parteek Bhatia
Affiliation:
Thapar University, India

Summary

Chapter Objectives

✓ To comprehend the concept of clustering, its applications, and features.

✓ To understand various distance metrics for clustering of data.

✓ To comprehend the process of K-means clustering.

✓ To comprehend the process of hierarchical clustering algorithms.

✓ To comprehend the process of the DBSCAN algorithm.

Introduction to Cluster Analysis

Generally, large datasets are not labeled, because labeling a large number of records requires a great deal of human effort. Such unlabeled data can be analyzed with the help of clustering techniques. Clustering is an unsupervised learning technique, that is, one that does not require a labeled dataset.

Clustering is defined as grouping a set of similar objects into classes or clusters. In other words, during cluster analysis, the data is grouped into classes or clusters so that records within a cluster (intra-cluster) are highly similar to one another but highly dissimilar to records in other clusters (inter-cluster), as shown in Figure 7.1.
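The intra-cluster versus inter-cluster idea can be illustrated with a minimal sketch. The toy records, the cluster names "A" and "B", the two helper functions, and the choice of Euclidean distance are all assumptions made for illustration; they are not taken from the chapter. The sketch only measures how close records are within and across clusters that have already been assigned; it does not perform clustering itself.

```python
from itertools import combinations
from math import dist
from statistics import mean

# Hypothetical toy data: two hand-assigned clusters of 2-D records.
clusters = {
    "A": [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)],
    "B": [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0)],
}


def mean_intra_distance(points):
    """Average Euclidean distance over all pairs of records inside one cluster."""
    return mean(dist(p, q) for p, q in combinations(points, 2))


def mean_inter_distance(points_a, points_b):
    """Average Euclidean distance over all pairs drawn from two different clusters."""
    return mean(dist(p, q) for p in points_a for q in points_b)


intra_a = mean_intra_distance(clusters["A"])
intra_b = mean_intra_distance(clusters["B"])
inter_ab = mean_inter_distance(clusters["A"], clusters["B"])

# A good grouping shows small intra-cluster and large inter-cluster distances.
print(f"intra A: {intra_a:.2f}, intra B: {intra_b:.2f}, inter A-B: {inter_ab:.2f}")
```

For a grouping that matches the definition above, both intra-cluster averages should come out well below the inter-cluster average.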

The similarity of records is determined from the values of the attributes that describe them. Cluster analysis is an important human activity. The first human beings, Adam and Eve, actually learned through the process of clustering. They did not know the name of any object; they simply observed each one. Based on the similarity of their properties, they sorted these objects into groups or clusters. For example, one group or cluster was named trees, another fruits, and so on. They further classified the fruits on the basis of properties such as size, colour, shape, and taste. After that, people assigned labels or names to these objects, calling them mango, banana, orange, and so on. Finally, all objects were labeled. Thus, we can say that the first human beings used clustering for learning: they formed clusters or groups of physical objects based on the similarity of their attributes.

Applications of Cluster Analysis

Cluster analysis has been widely used in various important applications such as:

  • Marketing: It helps marketers find out distinctive groups among their customer bases, and this knowledge helps them improve their targeted marketing programs.

  • Land use: Clustering is used for identifying areas of similar land use from the databases of earth observations.

  • Insurance: Clustering is helpful for recognizing clusters of insurance policyholders with a high regular claim cost.

  • Type: Chapter
    Information: Data Mining and Data Warehousing: Principles and Practical Techniques, pp. 155-205
    Publisher: Cambridge University Press
    Print publication year: 2019

    • Cluster Analysis
    • Parteek Bhatia
    • Book: Data Mining and Data Warehousing
    • Online publication: 26 April 2019
    • Chapter DOI: https://doi.org/10.1017/9781108635592.008