Chapter Preview. This chapter presents regression models in which the dependent variable is categorical, whereas the covariates can be either categorical or continuous. The first part presents models for a binary dependent variable, and the second part covers general categorical dependent variable models, where the dependent variable has more than two outcomes. The chapter is illustrated with datasets inspired by real-life situations. It also provides the corresponding R programs for estimation, which are based on the R function glm and the R package mlogit. The same estimates can be obtained with SAS or similar software for the models presented in this chapter.
Coding Categorical Variables
Categorical variables measure qualitative traits; in other words, they evaluate concepts that can be expressed in words. Table 3.1 presents examples of variables that are measured on a categorical scale and are often found in insurance company databases. These variables are also called risk factors when they denote characteristics that are associated with losses.
Categorical variables must have mutually exclusive outcomes. The number of categories is the number of possible response levels. For example, if we focus on insurance policies, we can have a variable such as TYPE OF POLICY CHOSEN with as many categories as the number of possible choices for the contracts offered to the customer.
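As a minimal sketch of how such a variable might be stored in R, the code below represents TYPE OF POLICY CHOSEN as a factor. The policy labels and the six observations are hypothetical and are not taken from the chapter's datasets; they only illustrate that each policy falls into exactly one category and that the number of categories equals the number of factor levels.

```r
# Hypothetical policy types for six contracts (illustrative labels only).
policy_type <- c("third-party", "comprehensive", "third-party",
                 "fire-and-theft", "comprehensive", "third-party")

# Store the variable as a factor so that R treats it as categorical.
# The levels argument lists every possible choice offered to the customer.
policy <- factor(policy_type,
                 levels = c("third-party", "fire-and-theft", "comprehensive"))

levels(policy)   # the possible categories
nlevels(policy)  # the number of categories
table(policy)    # number of policies observed in each category
```

Because the categories are mutually exclusive, the counts returned by table() add up to the number of policies.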