Two quite different statistical specifications lead to conventional regression computations. Under the first, usually called unconditional regression, the independent variables are assigned a distribution, and the sampling distributions of parameters are computed over it, not just over the variation in the disturbance. Then the multiple squared coefficient of correlation, R², is a substantively meaningful quantity with a population value. In such cases, it is certainly meaningful to estimate R² and to use the resulting estimate for the usual descriptive and inferential purposes.
The nature of most social scientific work, however, generates data poorly described by the first specification. The second specification, conditional regression, is usually more helpful, and for that reason, it overwhelmingly predominates in econometric textbooks. Under it, sampling distributions are conditioned on the observed values of the independent variables. Then, R² is a purely descriptive quantity with little substantive content. It is not a parameter, and it will vary meaninglessly across samples even when the underlying statistical law is unchanged. By contrast, the standard error of estimate (SEE) lacks these difficulties, and is a far better statistic for assessing goodness of fit.
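The contrast between R² and the SEE under conditional regression can be illustrated with a small simulation. The sketch below is not taken from the original analysis: the function names, the sample size, the slope, the disturbance standard deviation, and the two design ranges are all assumptions chosen for illustration. The same linear law with the same disturbance variance is applied to two fixed designs that differ only in how widely the independent variable is spread.

```python
import math
import random


def fit_simple_ols(x, y):
    """Bivariate OLS; returns (slope, intercept, r_squared, see)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - sse / syy
    # Standard error of estimate: residual standard deviation, n - 2 d.f.
    see = math.sqrt(sse / (n - 2))
    return slope, intercept, r_squared, see


def simulate(x_spread, n=2000, slope=1.0, intercept=0.0, sigma=1.0, seed=0):
    """Fit y = intercept + slope * x + e on a fixed, evenly spaced design.

    All default values are illustrative assumptions. Only x_spread, the
    half-width of the design interval, differs between the two runs.
    """
    rng = random.Random(seed)
    x = [-x_spread + 2.0 * x_spread * i / (n - 1) for i in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, sigma) for xi in x]
    return fit_simple_ols(x, y)


# Same statistical law, two different fixed designs.
narrow = simulate(x_spread=1.0)
wide = simulate(x_spread=5.0)

for label, (b, a, r2, see) in (("narrow", narrow), ("wide", wide)):
    print(f"{label:>6}: slope={b:.3f}  R2={r2:.3f}  SEE={see:.3f}")
```

Under these assumed settings, the wide design yields a substantially higher R² than the narrow one, although the slope and the disturbance variance are identical in both; the SEE stays close to the assumed disturbance standard deviation in each case.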
Lewis-Beck and Skalaban are persuasive where unconditional regression is concerned. But their argument encounters serious difficulties in the far more common situation in which conditional regression applies. Their blurring of the distinction between the two situations explains why their seemingly persuasive logic leads them astray.