5 - Classification

Published online by Cambridge University Press:  26 April 2019

Parteek Bhatia
Affiliation:
Thapar University, India

Summary

Chapter Objectives

✓ To comprehend the concept, types and working of classification

✓ To identify the major differences between classification and regression problems

✓ To become familiar with the working of classification

✓ To introduce the decision tree classification system with concepts of information gain and Gini Index

✓ To understand the workings of the Naïve Bayes method

Introduction to Classification

Databases are now widely used to support intelligent decision making. Two forms of data analysis, namely classification and regression, are used to predict future trends by analyzing existing data. Classification models predict a discrete value or class, while regression models predict a continuous value. For example, a classification model can be built to predict whether India will win a cricket match or not, while regression can be used to predict the runs that India will score in a forthcoming cricket match.
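The difference in output type can be illustrated with a minimal sketch. It assumes a tiny, invented table of past matches and uses a k-nearest-neighbour rule purely as a stand-in predictor; the data, the feature choice (run rate, wickets in hand) and the function names are hypothetical and are not taken from this chapter.

```python
from collections import Counter

# Hypothetical past matches: (run rate, wickets in hand), result, runs scored.
# All numbers are invented for illustration only.
MATCHES = [
    ((6.2, 7), "win", 310),
    ((5.8, 6), "win", 290),
    ((4.1, 3), "lose", 205),
    ((4.5, 2), "lose", 220),
    ((5.5, 5), "win", 275),
]


def nearest(features, k=3):
    """Return the k past matches closest to `features` (squared Euclidean distance)."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    return sorted(MATCHES, key=dist)[:k]


def classify(features, k=3):
    """Classification: a majority vote over the neighbours yields a discrete class."""
    votes = Counter(result for _, result, _ in nearest(features, k))
    return votes.most_common(1)[0][0]


def regress(features, k=3):
    """Regression: averaging the neighbours' runs yields a continuous value."""
    neighbours = nearest(features, k)
    return sum(runs for _, _, runs in neighbours) / len(neighbours)


query = (6.0, 6)
print(classify(query))  # a class label: "win" or "lose"
print(regress(query))   # a real number: predicted runs
```

Both functions consult the same neighbours; only the way the neighbours' targets are combined differs, which is what separates a class prediction from a numeric one in this sketch.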

Classification is a classical method used by machine learning researchers and statisticians to predict the outcome of unknown samples. It categorizes objects (or things) into a given discrete number of classes. Classification problems are of two types: binary or multiclass. In binary classification, the target attribute can have only two possible values. For example, a tumor is either cancerous or not, a team will either win or lose, the sentiment of a sentence is either positive or negative, and so on. In multiclass classification, the target attribute can have more than two values. For example, a tumor can be of type 1, type 2 or type 3 cancer; the sentiment of a sentence can be happy, sad, angry or loving; news stories can be classified as weather, finance, entertainment or sports news.
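As a small illustration of this distinction, the sketch below counts the distinct values of a target attribute to decide whether a problem is binary or multiclass. The function name `problem_type` and the sample label lists are hypothetical.

```python
def problem_type(labels):
    """Return 'binary' for two distinct target values, 'multiclass' for more."""
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("a classification target needs at least two classes")
    return "binary" if n_classes == 2 else "multiclass"


# Hypothetical target columns, echoing the examples in the text.
tumor = ["cancerous", "not cancerous", "cancerous"]
sentiment = ["happy", "sad", "angry", "loving", "sad"]

print(problem_type(tumor))      # binary
print(problem_type(sentiment))  # multiclass
```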

Some examples of business situations where the classification technique is applied are:

  • To analyze the credit history of bank customers to identify if it would be risky or safe to grant them loans.

  • To analyze the purchase history of a shopping mall's customers to predict whether they will buy a certain product or not.

In the first example, the system will predict a discrete value representing either risky or safe, while in the second example, the system will predict yes or no.

Some more examples to distinguish the concept of regression from classification are:

  • To predict how much a given customer will spend during a sale.

Type: Chapter
Information: Data Mining and Data Warehousing: Principles and Practical Techniques, pp. 65-127
Publisher: Cambridge University Press
Print publication year: 2019

    • Classification
    • Parteek Bhatia
    • Book: Data Mining and Data Warehousing
    • Online publication: 26 April 2019
    • Chapter DOI: https://doi.org/10.1017/9781108635592.006