✓ To comprehend the concept, types and working of classification
✓ To identify the major differences between classification and regression problems
✓ To become familiar with the working of classification
✓ To introduce the decision tree classification system with concepts of information gain and Gini Index
✓ To understand the workings of the Naïve Bayes method
Introduction to Classification
Databases are now widely used to support intelligent decision-making. Two forms of data analysis, classification and regression, are used to predict future trends by analyzing existing data. Classification models predict a discrete value, or class, while regression models predict a continuous value. For example, a classification model can be built to predict whether India will win a cricket match or not, while a regression model can be used to predict the number of runs India will score in a forthcoming cricket match.
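The difference between the two kinds of output can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the match data is invented, the single feature (run rate in the first ten overs) is hypothetical, and a simple k-nearest-neighbour rule is used purely because it is short, not because the chapter prescribes it. The same neighbours are used in both cases; only the way their targets are combined differs.

```python
from statistics import mean

def k_nearest(train, x, k=3):
    """Return the k training rows whose feature value is closest to x."""
    return sorted(train, key=lambda row: abs(row[0] - x))[:k]

def classify(train, x, k=3):
    """Classification: majority vote over the neighbours' discrete labels."""
    labels = [label for _, label in k_nearest(train, x, k)]
    return max(set(labels), key=labels.count)

def regress(train, x, k=3):
    """Regression: average of the neighbours' continuous targets."""
    return mean(target for _, target in k_nearest(train, x, k))

# Hypothetical data, invented for illustration only:
# (run rate in the first ten overs, match outcome)
match_results = [(4.0, "lose"), (4.5, "lose"), (5.0, "lose"),
                 (6.0, "win"), (6.5, "win"), (7.0, "win")]
# (run rate in the first ten overs, total runs scored)
match_runs = [(4.0, 210.0), (4.5, 225.0), (5.0, 240.0),
              (6.0, 280.0), (6.5, 295.0), (7.0, 310.0)]

print(classify(match_results, 6.4))  # a class label: "win" or "lose"
print(regress(match_runs, 6.4))      # a continuous value: predicted runs
```

The classifier can only ever return one of the labels seen in the training data, whereas the regressor can return any number in a continuous range.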
Classification is a classical method used by machine learning researchers and statisticians to predict the outcome for unknown samples. It categorizes objects (or things) into a given number of discrete classes. Classification problems are of two types: binary and multiclass. In binary classification, the target attribute can take only two possible values. For example, a tumor is either cancerous or not, a team will either win or lose, the sentiment of a sentence is either positive or negative, and so on. In multiclass classification, the target attribute can take more than two values. For example, a tumor can be of type 1, type 2 or type 3 cancer; the sentiment of a sentence can be happy, sad, angry or loving; news stories can be classified as weather, finance, entertainment or sports news.
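Whether a problem is binary or multiclass depends only on how many distinct values the target attribute takes. As a minimal sketch, the hypothetical helper below (the name `problem_type` is an assumption made for this illustration, not part of any library) inspects a list of target values and reports which kind of classification problem it describes.

```python
def problem_type(targets):
    """Report whether a list of target values describes a binary
    or a multiclass classification problem."""
    classes = set(targets)
    if len(classes) < 2:
        # With fewer than two classes there is nothing to distinguish.
        raise ValueError("at least two distinct classes are required")
    return "binary" if len(classes) == 2 else "multiclass"

# Target values taken from the examples in the text above.
tumor_targets = ["cancerous", "not cancerous", "cancerous"]
news_targets = ["weather", "finance", "entertainment", "sports", "finance"]

print(problem_type(tumor_targets))
print(problem_type(news_targets))
```

The tumor targets contain two distinct values and the news targets contain four, so the first list describes a binary problem and the second a multiclass one.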
Some examples of business situations where the classification technique is applied are:
In the first example, the system predicts a discrete value representing either risky or safe, while in the second example, it predicts yes or no.
Some more examples that distinguish regression from classification are: