On selecting regression variables to maximize their significance

Daniel McFadden

doi:10.1017/CBO9781139175203.011

9 - On selecting regression variables to maximize their significance

Published online by Cambridge University Press: 05 June 2012

Edited by Cheng Hsiao, Kimio Morimune and James L. Powell

Affiliations:
Daniel McFadden, University of California, Berkeley
Cheng Hsiao, University of Southern California
Kimio Morimune, Kyoto University, Japan
James L. Powell, University of California, Berkeley

Summary

Introduction

Often in applied linear regression analysis one must select an explanatory variable from a set of candidates. For example, in estimating production functions one must select among alternative measures of capital stock constructed using different depreciation assumptions. Or, in hedonic analysis of housing prices, one may use indicator or ramp variables that measure distance from spatial features such as parks or industrial plants, with cutoffs at distances that are determined as parameters. In the second example, the problem can be cast as one of nonlinear regression. However, when there are many linear parameters in the regression, direct nonlinear regression can be computationally inefficient, with convergence problematic. It is often more practical to approach this as a linear regression problem with variable selection.

This chapter shows that selecting variables in a linear regression to maximize their conventional significance is equivalent to direct application of nonlinear least squares. Thus, this method provides a practical computational shortcut that shares the statistical properties of the nonlinear least-squares solution. However, standard errors and test statistics produced by least squares are biased by variable selection, and are often inconsistent. This chapter gives practical consistent estimators for covariances and test statistics, and shows in examples that kernel-smoothing or bootstrap methods appear to give adequate approximations in samples of moderate size.
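
The equivalence claimed above can be checked numerically in a simple case. The following is a minimal sketch, not the chapter's own procedure: it assumes a single selected variable, a hypothetical ramp variable max(c - distance, 0) with an unknown cutoff c, and simulated data whose coefficients and sample size are made up for illustration. With the other regressors held fixed, the squared t-statistic on the candidate is a decreasing function of the residual sum of squares, so the candidate with the largest |t| is also the least-squares choice over the grid of cutoffs.

```python
import numpy as np


def fit_with_candidate(y, X_base, z):
    """OLS of y on [X_base, z]; return (SSR, conventional t-statistic on z)."""
    X = np.column_stack([X_base, z])
    n_obs, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = float(resid @ resid)
    sigma2 = ssr / (n_obs - k)               # usual unbiased variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)    # conventional OLS covariance
    t_stat = beta[-1] / np.sqrt(cov[-1, -1])
    return ssr, float(t_stat)


def ramp(distance, cutoff):
    """Hypothetical ramp variable: positive inside the cutoff, zero beyond it."""
    return np.maximum(cutoff - distance, 0.0)


# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200
distance = rng.uniform(0.0, 10.0, size=n)
x1 = rng.normal(size=n)
X_base = np.column_stack([np.ones(n), x1])
y = 1.0 + 0.5 * x1 + 0.8 * ramp(distance, 4.0) + rng.normal(size=n)

# Grid of candidate cutoffs: one candidate regressor per cutoff.
cutoffs = np.linspace(1.0, 9.0, 33)
results = [fit_with_candidate(y, X_base, ramp(distance, c)) for c in cutoffs]
ssr = np.array([r[0] for r in results])
t = np.array([r[1] for r in results])

best_t = int(np.argmax(np.abs(t)))   # candidate with maximal significance
best_ssr = int(np.argmin(ssr))       # grid-search least-squares choice

print("cutoff chosen by max |t|:", cutoffs[best_t])
print("cutoff chosen by min SSR:", cutoffs[best_ssr])
```

The t-statistic reported for the selected candidate is the conventional one and ignores the search over cutoffs, which is the source of the bias the summary describes. The sketch does not attempt the corrected covariance estimators.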

Type: Chapter
Information: Nonlinear Statistical Modeling: Proceedings of the Thirteenth International Symposium in Economic Theory and Econometrics: Essays in Honor of Takeshi Amemiya, pp. 259-280

DOI: https://doi.org/10.1017/CBO9781139175203.011

Publisher: Cambridge University Press

Print publication year: 2001
