Another Game with Nature: An Ecological Regression Model of the British Two-Party Vote Ratio in 1970

Ivor Crewe; Clive Payne

doi:10.1017/S0007123400000521

Published online by Cambridge University Press: 27 January 2009

Extract

This article develops a number of themes first raised in an earlier paper where we attempted to publicize the existence of Census data based, for the first time, on British parliamentary constituencies, and where we briefly described the potential and limits of a variety of available statistical techniques of analysis. Until the earlier paper was published, studies of British electoral behaviour using aggregate data were largely historical, generally used only the simplest statistical techniques such as cross-tabulations, and usually proceeded blithely unaware of the snares of ecological inference. A small number of more advanced analyses had appeared but none focused on Britain or even on England as a whole. Since our earlier article appeared, there have been two attempts to construct predictive models of Labour support by applying multivariate statistical analysis to aggregate-level data. As we show in this paper, both Barnett and Rasmussen produce models that are statistically less powerful than our own and are subject to various weaknesses, of which the most important is the failure to tackle the problem of ecological inference.

Type: Articles
Information: British Journal of Political Science, Volume 6, Issue 1, January 1976, pp. 43–81

DOI: https://doi.org/10.1017/S0007123400000521
Copyright: Copyright © Cambridge University Press 1976

References

¹ Crewe, Ivor and Payne, Clive, ‘Analysing the Census Data’, in Butler, David and Pinto-Duschinsky, Michael, The British General Election of 1970 (London: Macmillan, 1971), pp. 416–36.

² See Pelling, Henry, The Social Geography of British Elections 1885–1910 (London: Macmillan, 1967); Kinnear, Michael, The British Voter: An Atlas and Survey since 1885 (London: Batsford, 1968); Cornford, James, ‘Aggregate Election Data and British Party Alignments 1885–1910’, in Allardt, Erik and Rokkan, Stein, eds., Mass Politics (New York and London: The Free Press, 1970); Nossiter, T. J., ‘Aspects of Electoral Behaviour in English Constituencies 1832–1868’, in Allardt and Rokkan, Mass Politics. All four authors imply rather than assert causal connections; but none seriously attempts to check against the possibility that ecological fallacy has been committed. For less ambitious but equally unsophisticated analyses which locate but do not explain variations in the Labour vote after controlling for social class only, see Piepe, Anthony, Prior, Robin and Box, Arthur, ‘The Location of the Proletarian and Deferential Worker’, Sociology, III (1969), 239–44; and George, Wilma, ‘Social Conditions and the Labour Vote in the County Boroughs of England and Wales’, British Journal of Sociology, II (1951), 255–9.

Two further studies which examine the relationship between Labour support and one aggregate variable only, size of place, are Bealey, Frank and Dyer, Michael, ‘Size of Place and the Labour Vote in Britain 1918–1966’, Western Political Quarterly, xxiv (1971), 84–113; and Grant, W. P., ‘Size of Place and Local Labour Strength’, British Journal of Political Science, II (1972), 259–60.

³ See Moser, C. and Scott, W., British Towns (London: Oliver and Boyd, 1961), which pioneered the use of correlation and principal components analysis on British Census data; Cox, Kevin, ‘Geography, Social Contexts and Voting Behaviour in Wales’, in Allardt and Rokkan, Mass Politics, in which factor analysis is applied to a small base of seventeen boroughs and counties; and by the same author, ‘Voting in the London Suburbs: a Factor Analysis and a Causal Model’, in Dogan, M. and Rokkan, S., eds., Quantitative Ecological Analysis in the Social Sciences (Cambridge, Mass.: MIT Press, 1969), which explores the independent effect of suburban residence on partisanship and turnout. See also Crewe, Ivor, ‘The Politics of “Affluent” and “Traditional” Workers in Britain: an Aggregate Data Analysis’, British Journal of Political Science, III (1973), 29–52, which uses analysis of residuals to test Goldthorpe et al.'s theories about the Labour partisanship of ‘affluent’ and ‘traditional’ workers.

⁴ Rasmussen, Jorgen, ‘The Impact of Constituency Structural Characteristics upon Political Preferences in Britain’, Comparative Politics, v (1973), 123–45; Barnett, Malcolm J., ‘Aggregate Models of British Voting Behaviour’, Political Studies, xxi (1973), 121–34.

⁵ For examples from Rasmussen, see fn. 29.

⁶ For example, the ‘West Midlands’ group contained a number of constituencies with unusually high proportions both of coloured immigrants and of skilled workers. The group was so labelled because a large minority of its constituencies did not contain abnormally large proportions of immigrants or of skilled manual workers, whereas other constituencies outside the West Midlands which did possess those characteristics nevertheless had low residuals. Thus the form of our checking was thorough, but statistically crude for all that.

⁷ General Register Office, Census 1966: United Kingdom General and Parliamentary Constituency Tables (London: HMSO, 1969). The data used for the analysis presented in this paper are held by the SSRC Survey Archive at the University of Essex.

⁸ A linear regression model may be expressed as

$$L_i = B_0 + B_1 Z_{1i} + B_2 Z_{2i} + \dots + B_u Z_{ui} + e_i, \qquad i = 1, \dots, n,$$

where $L_i$ is the value of the dependent variable (percent Labour vote) for the $i$th of the $n$ units (parliamentary constituencies). $Z_{1i}, \dots, Z_{ui}$ are the actual (observed) values of the predictor variables $1, 2, \dots, u$ for the $i$th unit. $B_0, B_1, \dots, B_u$ are the parameters (or regression coefficients) of the model and $e_i$ is the error (or disturbance) term. The parameters $B_0, B_1, \dots, B_u$ are estimated from data on all units included in the analysis in such a way that the observed value $L_i$ is reproduced with the minimum of error by the criterion of ‘least squares’. The regression coefficients $B_1, \dots, B_u$ reveal the degree of change in the dependent variable for each unit of change in the predictor variables $Z_1, \dots, Z_u$ given that the other predictor variables are in the model ($B_0$ is a constant). They are only direct measures of the effect of the predictors $Z_1, \dots, Z_u$, however, if the model is correctly specified and the predictor variables are independent of each other. The model is linear only in regard to the parameters. The predictor variables $Z_1, \dots, Z_u$ may be any transformation or function of the original variables in the analysis such as a quadratic transformation or factor scores derived from a number of variables.
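The least-squares criterion described in note 8 can be illustrated with a minimal Python sketch that solves the normal equations directly. The function names `solve` and `ols` and the toy data are illustrative assumptions, not taken from the paper, which does not describe the software the authors used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(z_rows, labour):
    """Least-squares estimates [B0, B1, ..., Bu] for L_i = B0 + sum_j Bj * Z_ji + e_i."""
    x = [[1.0] + list(row) for row in z_rows]  # prepend the constant term
    k = len(x[0])
    xtx = [[sum(r[p] * r[q] for r in x) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y for r, y in zip(x, labour)) for p in range(k)]
    return solve(xtx, xty)


# Hypothetical toy data, built without error from B0 = 10, B1 = 0.5, B2 = -2.
z = [(60, 1), (40, 0), (75, 2), (30, 1), (55, 3)]
labour = [10 + 0.5 * z1 - 2 * z2 for z1, z2 in z]
print(ols(z, labour))
```

Because the toy data contain no error term, the estimates should reproduce the generating coefficients up to floating-point rounding; with real constituency data the fit would of course be imperfect.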

⁹ Robinson, W. S., ‘Ecological Correlations and the Behavior of Individuals’, American Sociological Review, xv (1950), 351–7.

¹⁰ Stokes, Donald E., ‘Cross-Level Inference as a Game with Nature’, in Bernd, J. L., ed., Mathematical Applications in Political Science IV (Charlottesville: University of Virginia Press, 1969), pp. 62–83.

¹¹ See Butler, David and Stokes, Donald, Political Change in Britain (London: Macmillan, 1969), Chaps. 4–6; Pulzer, P. G. J., Political Representation and Elections in Britain (London: Allen and Unwin, 1967), Chap. 4 (‘class is the basis of British party politics; all else is embellishment and detail’ (p. 98)); Runciman, W. G., Relative Deprivation and Social Justice (London: Routledge and Kegan Paul, 1967), p. 56, on the overriding importance of the manual/non-manual division; and Alford, R. R., Party and Society (London: John Murray, 1964) for international comparisons of class voting in which the British rate emerges as the highest.

¹² Particular reference should be made to Duncan, Otis Dudley and Davis, Beverley, ‘An Alternative to Ecological Correlation’, American Sociological Review, xviii (1953), 665–6, where it is pointed out that it is always possible to calculate the range of values for each cell compatible with the marginals. But this solution is impractical because the range of possible cell entries will usually be too wide to be of any value.
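The point made in note 12 can be seen in a short Python sketch of the bounds calculation. It assumes a two-by-two table for a single constituency in which the marginals are the Labour share of the vote and the manual share of the electorate; the function name and the example figures are hypothetical and serve only to show how wide the admissible range can be.

```python
def duncan_davis_bounds(labour_share, manual_share):
    """Range of the proportion of manual voters choosing Labour that is
    compatible with the two marginals of one constituency's 2x2 table."""
    if not 0.0 < manual_share <= 1.0:
        raise ValueError("manual_share must lie in (0, 1]")
    non_manual_share = 1.0 - manual_share
    # Lowest value: every non-manual voter votes Labour, manual voters supply the rest.
    lower = max(0.0, (labour_share - non_manual_share) / manual_share)
    # Highest value: every Labour vote comes from a manual voter.
    upper = min(1.0, labour_share / manual_share)
    return lower, upper


# Hypothetical constituency: 50 per cent Labour, 60 per cent manual.
print(duncan_davis_bounds(0.5, 0.6))
```

For this hypothetical constituency the admissible range runs from roughly one-sixth to five-sixths, which illustrates why the bounds alone are rarely informative.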

¹³ Goodman, Leo A., ‘Some Alternatives to Ecological Correlation’, American Journal of Sociology, LXIV (1959), 610–15.

¹⁴ This paragraph is indebted to the very clear account found in Shively, W. Phillips, ‘Ecological Inference: The Use of Aggregate Data to Study Individuals’, American Political Science Review, LXIII (1969), 1183–96, p. 1187.

¹⁵ Butler and Stokes, Political Change in Britain, pp. 140–50.

¹⁶ Butler and Stokes, Political Change in Britain, pp. 303–12.

¹⁷ For example, Stokes in ‘Cross-Level Inference’ suggests that the ecological correlation between L and M can be interpreted as a measure of the extent to which row and column marginals are compatible with the assumption that the cell values are uniform across all constituencies. He also proposes the derivation of subsets of constituencies for which the ecological correlation is particularly strong from the scatterplot of L and M. Cell values could then be estimated in the knowledge that within each subset the constancy of cell values was highly probable. Shively (‘Ecological Inference’, pp. 1189–90) proves that, for the estimates to be unbiased, the cell values must be constant for aggregate units with different values on the independent variable (in our case, M). To the extent that the aggregate units are grouped according to their value on the dependent variable (in our case, L) the regression estimates of cell values will be biased. Shively specifies conditions under which this bias may be reduced to tolerable levels (see ‘Ecological Inference’, pp. 1191–2).

¹⁸ Since the submission of this paper to the Journal, our attention has been drawn to Miller, William, Raab, Gillian and Britto, K., ‘Voting Research and the Population Census 1918–71: Data for Constituency Analysis’, Journal of the Royal Statistical Society, Series A, 137 (1974), 384–411, which adopts a different solution to the same problem, but starts from a similar standpoint.

¹⁹ A slight recomputation of figures provided in Butler and Stokes, Political Change in Britain, pp. 140–1, shows that, although the proportion of manual and of non-manual workers voting Labour differs markedly between regions and between rural, urban, seaside and mining constituencies, the difference between the two proportions ranged only from 0·327 to 0·435.

²⁰ For a thorough discussion of negative and other inadmissible estimates of dependent proportions, see Cox, David, The Analysis of Binary Data (London: Methuen, 1970); and Telser, Lester G., ‘Least Squares Estimates of Transition Probabilities’, in Christ, C. F. et al., Measurement in Economics (Stanford, Calif.: Stanford University Press, 1963), pp. 270–93.

²¹ See Miller, W. L., ‘Measures of Electoral Change Using Aggregate Data’, Journal of the Royal Statistical Society, Series A, 135 (1972), 122–42.

²² On p. vii of the 1966 Sample Census it is calculated that ‘if the sampling figure for a particular category is less than a quarter of the whole sample, as it is in most cases, then the “standard error” of the sample figure is approximately its own square root. There is therefore a 2:1 chance that the error due to sampling is less than its square root, and a 20:1 chance that it is less than twice its square root’. But Censuses are known to underestimate the extent of immigration, overcrowding and poverty, and this could undoubtedly contribute to the error term.
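The rule of thumb quoted in note 22 can be worked through numerically. The sketch below simply applies the square-root approximation as stated in the quotation; the function name and the sample figure of 400 are hypothetical.

```python
import math


def sampling_error_bands(sample_figure):
    """Apply the Census rule of thumb: the standard error of a sample figure
    is approximately its own square root (for categories under a quarter of
    the sample). Returns (one standard error, two standard errors)."""
    if sample_figure < 0:
        raise ValueError("sample_figure must be non-negative")
    standard_error = math.sqrt(sample_figure)
    return standard_error, 2.0 * standard_error


# Hypothetical sample figure of 400 persons in some category.
one_se, two_se = sampling_error_bands(400)
print(one_se, two_se)
```

On the quoted odds, a sample figure of 400 would have a 2:1 chance of a sampling error below 20 and a 20:1 chance of one below 40. The calculation covers sampling error only, not the systematic undercounting mentioned at the end of the note.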

²³ See, for example, Draper, N. R. and Smith, H., Applied Regression Analysis (New York: Wiley, 1966), pp. 50–5; Anscombe, F. J. and Tukey, J. W., ‘The Examination and Analysis of Residuals’, Technometrics, (1963), 141–60; and Cox, D. R. and Snell, E. J., ‘A General Definition of Residuals’, Journal of the Royal Statistical Society, Series B, 30 (1968), 246–75.

²⁴ Cox, Analysis of Binary Data, pp. 10–15, has considered a variety of ways to transform dependent variables expressed as proportions so as to produce a constancy of variance in the error term. He points out, however, that for proportions in the range of 0·2 to 0·8 (as they are in our case) the variance is approximately constant.

²⁵ See Anscombe, F. J., ‘The Rejection of Outliers’, Technometrics, II (1960), 123–47.

²⁶ The twelve constituencies of Northern Ireland, where the party system is so different; the Speaker's seat (Southampton Itchen), which was uncontested by an official Conservative; Greenock, which was fought by the Liberal and Labour parties only; and Merthyr Tydfil, in which the official Labour candidate was runner-up to the incumbent Independent Labour MP.

²⁷ Barnett, ‘Aggregate Models’, p. 131.

²⁸ Barnett, ‘Aggregate Models’, p. 130.

²⁹ Two must suffice. Like Barnett, Rasmussen found that ‘moderate housing density’ persistently correlated with the Labour vote across a variety of constituency groups, and he implies that therefore the residents of such housing are peculiarly prone to vote Labour. Statistically this is correct; but it is hardly likely that it is because their housing is of moderate density. Rasmussen is also impressed with the fact that in industrial constituencies, as opposed to any other type, social class II is negatively correlated with the Labour vote. From this fact about a section of the middle classes he takes an inferential leap of great length to conclude that the deferential working-class Conservative is an industrial rather than agricultural phenomenon, despite established opinion and research to the contrary. He fails to consider the possibility that his finding merely reflects a particularly strong correlation between percent class I and percent class II in industrial constituencies (produced by the virtual absence of both classes in such constituencies).

³⁰ Crewe and Payne, ‘Analysing the Census Data’, pp. 435–6.

³¹ The various indicators of the variable ‘structure of party contest’ (the joint and separate effects of Liberal and Nationalist candidatures) are the exception.

³² Ridge regression is a procedure for estimating the coefficients in a linear model involving high intercorrelations among the predictors. The estimates obtained are more precise than those derived by the ordinary least squares procedure although some bias is introduced into the estimates. Further details can be found in Miller, ‘Measures of Electoral Change Using Aggregate Data’.
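The trade-off described in note 32 can be sketched in Python using the standard textbook form of the ridge estimator, in which a constant k is added to the diagonal of the cross-product matrix of the centred predictors. This form, the function names and the toy data are assumptions made for illustration; the exact variant used in the paper (or by Miller) is not stated in the source.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge(z_rows, y, k):
    """Ridge estimates (intercept, slopes) with penalty k on the slopes only.
    With k = 0 this reduces to ordinary least squares."""
    n, u = len(y), len(z_rows[0])
    z_mean = [sum(r[j] for r in z_rows) / n for j in range(u)]
    y_mean = sum(y) / n
    zc = [[r[j] - z_mean[j] for j in range(u)] for r in z_rows]  # centred predictors
    yc = [v - y_mean for v in y]
    ztz = [[sum(r[p] * r[q] for r in zc) + (k if p == q else 0.0)
            for q in range(u)] for p in range(u)]
    zty = [sum(r[p] * v for r, v in zip(zc, yc)) for p in range(u)]
    slopes = solve(ztz, zty)
    intercept = y_mean - sum(b * mu for b, mu in zip(slopes, z_mean))
    return intercept, slopes


# Hypothetical data with two highly intercorrelated predictors.
z = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(ridge(z, y, 0.0))  # ordinary least squares
print(ridge(z, y, 5.0))  # shrunken, more stable slopes
```

Increasing k shrinks the slope estimates towards zero, which is the source of both the added bias and the gain in precision that the note mentions.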

³³ Barnett presents three models, of which two explain 72 per cent of the variance with three predictors and one explains 75 per cent with five. But, as we have already argued, in all three cases the predictors are either substantively uninteresting, or are multiple indicators of the same variable, social class. Rasmussen constructs models which account for up to 92 per cent of variance, but for particular regions only and with a minimum of seven predictors.

³⁴ See Berelson, B., Lazarsfeld, P. and McPhee, W., Voting (Chicago: University of Chicago Press, 1954), pp. 98–101, on the tendency of the apathetic and cross-pressured to ‘break’ disproportionately in favour of the Republicans in Republican-dominated Elmira; and Butler and Stokes, Political Change in Britain, pp. 303–12, on the contribution of homogeneous, local, partisan cultures to the ‘paradox of the uniform swing’.

³⁵ Rasmussen, ‘Impact of Constituency Structural Characteristics’, found that partial correlations between a predictor and the Labour vote varied more between regions than between urban and rural, or industrial and non-industrial constituencies. Butler and Stokes, Political Change in Britain, pp. 135–44, demonstrate that regional differences in party strength owe as much to regional variations in the two-party division among manual and non-manual voters as to differences in the class composition of the regions.

³⁶ See Berrington, Hugh and Bedeman, Trevor, ‘The February Election’, Parliamentary Affairs, xxvii (1973–1974), 317–32.

³⁷ Opinion poll data confirm that the rate of defection among Labour partisans considerably exceeded that of Conservative partisans in the 1970 election. See Teer, Frank and Spence, James O., Public Opinion Polls (London: Hutchinson, 1973), pp. 195–6.

³⁸ Berrington and Bedeman, ‘February Election’, p. 329.

³⁹ For two degrees of freedom the F ratio must exceed 3·8, that is (1·96)², for the regression coefficient to be significant at the 5 per cent level.

⁴⁰ This is confirmed by the values of Draper and Smith's statistics T₂₁ and T₁₂ for the examination of residuals. T₁₂ is the summed product of the residual and the square of the fitted value over all units. It indicated a slight departure from linearity which could be reduced by introducing extra variables or by using a quadratic transformation of some of the predictors included in the final model (T₁₂ = 2,397 compared with an expected value of zero). The non-linearity has been considerably reduced, however, from that present in the simple class/party model where T₁₂ = 10,683, and could be diminished yet further by refining the rather crude partisanship variable used in the model. In other words, the modest degree of non-linearity revealed probably indicates that the effect of the constituency's partisanship is more subtle than we have hypothesized. T₂₁ is the summed product of the square of the residual and the fitted value over all units and gave no indication of non-constancy of variance (T₂₁ = 1,169 compared with an expected value of 995 if variance was constant).
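The two residual statistics defined in note 40 can be computed directly from their verbal definitions. The sketch below follows those definitions only; the function name and the three-unit toy data are hypothetical, and the expected values against which the statistics are judged (zero for T₁₂, and the constant-variance benchmark for T₂₁) are not derived here.

```python
def residual_statistics(observed, fitted):
    """Return (T12, T21) as defined in the note:
    T12 = sum of residual * fitted**2   (sensitive to non-linearity),
    T21 = sum of residual**2 * fitted   (sensitive to non-constant variance)."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    residuals = [o - f for o, f in zip(observed, fitted)]
    t12 = sum(e * f ** 2 for e, f in zip(residuals, fitted))
    t21 = sum(e ** 2 * f for e, f in zip(residuals, fitted))
    return t12, t21


# Hypothetical observed and fitted Labour percentages for three units.
print(residual_statistics([50, 40, 62], [48, 43, 60]))
```

A value of T₁₂ far from zero would point to curvature that the fitted model misses, which is how the note reads the fall from 10,683 to 2,397.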

⁴¹ The North West region deviated conspicuously from the national average in the elections of 1959, 1970 and February 1974; the West Midlands deviated in the general elections of 1964, 1966, 1970 and February 1974. Its swing to Labour in February 1974 was the most deviant regional swing to take place since the war.

⁴² In a further six, heavily working-class, constituencies in 1966 (Huddersfield West, West Fife, Birmingham Ladywood, Colne Valley, West Lothian and the Western Isles) the Conservatives lost and failed to capture at least 60 per cent of the total ‘opposition’ vote (this was also true of the last four in 1970). The Liberal or Nationalist or both presumably drew more support from erstwhile Labour than erstwhile Conservative voters, thereby depressing Labour's share of the two-party vote. Strategic voting cannot therefore be the primary explanation of the high residuals in these cases.

⁴³ The statistic is the ‘two-party swing’, i.e. the average of the Conservative percentage gain and the Labour percentage loss since the previous election, based on their share of the total vote, and is confined to constituencies in which the two major parties came first and second in both elections. A positive sign denotes a swing to the Conservatives.
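The definition of the two-party swing in note 43 translates into a one-line calculation. The sketch below follows that definition; the function and parameter names and the example percentages are hypothetical.

```python
def two_party_swing(con_prev, con_now, lab_prev, lab_now):
    """Average of the Conservative percentage gain and the Labour percentage
    loss since the previous election, each measured as a share of the total
    vote. A positive result denotes a swing to the Conservatives."""
    conservative_gain = con_now - con_prev
    labour_loss = lab_prev - lab_now
    return (conservative_gain + labour_loss) / 2.0


# Hypothetical seat: Conservatives 40 -> 45 per cent, Labour 50 -> 47 per cent.
print(two_party_swing(40.0, 45.0, 50.0, 47.0))
```

As the note specifies, the statistic is meaningful only for constituencies in which the two major parties came first and second at both elections; the function does not check that condition.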

⁴⁴ Six seats held by-elections during the previous parliament, twice the figure one would obtain by chance, and it seemed plausible that the massive swings to the Conservatives recorded in the by-elections would not have been completely eroded in the general election that followed. But the addition of ‘by-elections 1966–70’ to our model proved disappointing: its regression coefficient was never statistically significant.

⁴⁵ In a further four, largely middle-class, constituencies in 1966 (South Kensington, Finchley, St Ives and East Surrey) Labour lost and failed to capture at least 60 per cent of the total ‘opposition’ vote (this was also true of the latter two in 1970). The Liberals presumably drew more support from erstwhile Conservative than erstwhile Labour voters, thereby depressing the Conservative share of the two-party vote. Again, strategic voting cannot therefore be the primary explanation of the high residuals in these cases.

⁴⁶ In the case of all but one (Poplar) this could be attributed to the reduction of the Conservative vote to tiny proportions by the presence of strong support for minor parties.

⁴⁷ Carmarthen and Aberdeenshire West were omitted because of the distorting effects on the two-party vote produced by the strength of the Nationalist and Liberal parties respectively.

⁴⁸ Almost exactly the same amount of variance was explained (0·89) and the regression coefficient for each predictor variable was always within 3 per cent of its value in the original final model, with the exception of Nationalist intervention which, as might be expected, substantially depressed Labour's share of the vote. There was a modest turnover in the fifty constituencies with the highest and the fifty with the lowest residuals, as a result of the magnitude of the residuals increasing in three- or four-party contests where the Labour or Conservative candidate fared badly (found almost entirely in rural Wales and rural Scotland); this was a natural reflection of our altered definition of the Labour vote.

⁴⁹ The proportion of variance explained was reduced to 0·83 and there were dramatic decreases in the size of some of the regression coefficients. These discrepancies reflected not only the impact of strong support for minor parties on Labour's share of the electorate, but also the fact that in many seats where Labour had an overwhelming majority over the Conservatives, turnout tended to be so low that Labour's share of the electorate was unexpectedly modest.

⁵⁰ For an informative study of one such constituency, see Madgwick, P. J., with Griffiths, N. and Walker, V., The Politics of Rural Wales: a Study of Cardiganshire (London: Hutchinson, 1973).

⁵¹ Barnett and Rasmussen both use the Labour share of the total vote as their dependent variable and neither author controls for type of party contest or regards it as an independent variable. This is particularly unfortunate in Rasmussen's case, which is partly devoted to an examination of Britain's supposed national political homogeneity but which nowhere mentions the regional concentration of Liberal and of Nationalist candidates. Thus Rasmussen makes much of the fact that occupational variables and others relating to stratification explain far more of the variance of the Labour share of the poll in urban than in rural areas, without pointing out that the minor parties were far more likely to contest rural than urban constituencies in 1970. Barnett justifies his choice of dependent variable on the unsubstantiated grounds that ‘it seemed likely to be more sensitive to the range of socio-economic influences reported in the Census than the Conservative vote’ (Barnett, ‘Aggregate Models’, p. 124) and also argues with some ambiguity that ‘abstention proved of no significance at all and was quickly abandoned’ (p. 125). We did not find this to be the case.
