Chapter Preview. Linear modeling, also known as regression analysis, is a core tool in statistical practice for data analysis, prediction, and decision support. Applied data analysis requires judgment and domain knowledge as well as technical skill in working with data. This chapter summarizes the linear model and uses a series of examples to discuss model assumptions, parameter estimation, variable selection, and model validation. The examples are grounded in data to help relate the theory to practice, and all of the examples and exercises are completed using the open-source R statistical computing environment. A detailed case study pays particular attention to the role of exploratory data analysis in the iterative process of criticizing, improving, and validating models. Linear models provide a foundation for many of the more advanced statistical and machine-learning techniques explored in the later chapters of this volume.
Linear models are used to analyze relationships among various pieces of information to arrive at insights or to make predictions. These models are referred to by many terms, including linear regression, regression, multiple regression, and ordinary least squares. In this chapter we adopt the term linear model.
Linear models provide a vehicle for quantifying relationships between an outcome (also referred to as dependent or target) variable and one or more explanatory (also referred to as independent or predictive) variables.
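As a sketch of what this quantification looks like, the linear model for n observations and k explanatory variables is conventionally written as follows. The notation (y, x, β, ε, and the indices i, j, k, n) is the standard textbook convention and is an assumption here, not something fixed by the text above.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i,
\qquad i = 1, \ldots, n
```

In this form, y_i is the outcome for observation i, x_{ij} is the value of the j-th explanatory variable, the β_j are unknown coefficients, and ε_i is an error term. Each coefficient β_j (for j ≥ 1) measures the expected change in the outcome for a one-unit increase in the j-th explanatory variable, holding the other explanatory variables fixed.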