Corrected Standard Errors with Clustered Data

  • John E. Jackson (a1)


Cluster-robust standard errors (CRSE) are in common use because data are often collected from units, such as cities, states, or countries, with multiple observations per unit. There is considerable discussion of how best to estimate standard errors and confidence intervals when using CRSE (Harden 2011; Imbens and Kolesár 2016; MacKinnon and Webb 2017; Esarey and Menger 2019). Extensive simulations in this literature and here show that CRSE seriously underestimate coefficient standard errors and their associated confidence intervals, particularly with a small number of clusters and when there is little within-cluster variation in the explanatory variables. These same simulations show that a method developed here provides more reliable estimates of coefficient standard errors. In extreme conditions the new estimates still underestimate confidence intervals for tests of individual coefficients and of sets of coefficients, but by far less than CRSE do. Simulations also show that this method produces more accurate standard error and confidence interval estimates than bootstrapping, which is often recommended as an alternative to CRSE.
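For readers who want a concrete picture of the baseline being criticized, the following is a minimal sketch of the conventional cluster-robust (sandwich) variance estimator for OLS in the spirit of Liang and Zeger (1986). It is not the corrected method developed in the article. The function name `ols_with_crse`, the simulated data, and the particular finite-sample correction factor are assumptions made purely for illustration.

```python
import numpy as np

def ols_with_crse(X, y, clusters):
    """OLS coefficients with conventional cluster-robust standard errors.

    Hypothetical helper for illustration only; this is the standard
    sandwich estimator, not the corrected estimator from the article.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    n, k = X.shape

    bread = np.linalg.inv(X.T @ X)        # (X'X)^{-1}
    beta = bread @ X.T @ y                # OLS coefficients
    resid = y - X @ beta

    # "Meat": sum over clusters of (X_g' u_g)(X_g' u_g)'
    meat = np.zeros((k, k))
    labels = np.unique(clusters)
    for g in labels:
        idx = clusters == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)

    # A commonly used finite-sample correction (an assumption here)
    G = len(labels)
    correction = (G / (G - 1)) * ((n - 1) / (n - k))
    vcov = correction * bread @ meat @ bread
    return beta, np.sqrt(np.diag(vcov))

# Simulated example: few clusters, regressor varies only across clusters
rng = np.random.default_rng(0)
G, m = 10, 20
cluster_ids = np.repeat(np.arange(G), m)
x = np.repeat(rng.normal(size=G), m)
u = np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(G * m), x])

beta, se = ols_with_crse(X, y, cluster_ids)
print("coefficients:", beta)
print("cluster-robust SEs:", se)
```

The simulated design mimics the setting the abstract flags as problematic: a small number of clusters and no within-cluster variation in the explanatory variable.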



Author’s note: I want to thank Ken Kollman, Chuck Shipan, Matthew Webb and the ubiquitous anonymous referee for their helpful comments and Diogo Ferrari for his comments and the R package “ceser” for computing CESE. All are absolved from any and all errors.

Contributing Editor: Jeff Gill



References
Angrist, J. D., and Pischke, J.-S. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton, NJ: Princeton University Press.
Bell, R. M., and McCaffery, D. F. 2002. “Bias Reduction and Standard Errors for Linear Regression with Multi-Stage Samples.” Survey Methodology 26(2):169–181.
Bormann, N.-C., and Golder, M. 2013. “Democratic Electoral Systems Around the World, 1946–2011.” Electoral Studies 32:360–369.
Brown, R. D., Jackson, R. A., and Wright, G. C. 1999. “Registration, Turnout, and State Party Systems.” Political Research Quarterly 52(3):463–479.
Cameron, C. A., Gelbach, J. B., and Miller, D. L. 2008. “Bootstrap-Based Improvements for Inference with Clustered Errors.” Review of Economics and Statistics 90(3):414–427.
Davidson, R., and MacKinnon, J. G. 1993. Estimation and Inference in Econometrics. New York, NY: Oxford University Press.
Eicker, F. 1967. “Limit Theorems for Regressions with Unequal and Dependent Errors.” In Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, edited by Le Cam, L. M. and Heyman, J., 59–82. Berkeley, CA: California University Press.
Elgie, R., Bueur, C., Dolez, B., and Laurent, A. 2014. “Proximity, Candidates, and Presidential Power: How Directly Elected Presidents Shape the Legislative Party System.” Political Research Quarterly 67(3):467–477.
Esarey, J., and Menger, A. 2019. “Practical and Effective Approaches to Dealing with Clustered Data.” Political Science Research and Methods 7(3):541–559.
Franzese, R. J. Jr. “Empirical Strategies for Various Manifestations of Multilevel Data.” Political Analysis 13(4):430–446.
Golder, M. 2005. “Democratic Electoral Systems Around the World, 1946–2000.” Electoral Studies 24:103–121.
Golder, M. 2006. “Presidential Coattails and Legislative Fragmentation.” American Journal of Political Science 50(1):34–48.
Greene, W. H. 2012. Econometric Analysis. Upper Saddle River, NJ: Prentice-Hall.
Harden, J. J. 2011. “A Bootstrap Method for Conducting Statistical Inference with Clustered Data.” State Politics and Policy Quarterly 11(2):223–246.
Hicken, A., and Stoll, H. 2012. “Are All Presidents Created Equal? Presidential Powers and the Shadow of Presidential Elections.” Comparative Political Studies 46(3):291–319.
Huber, P. J. 1967. “The Behavior of Maximum Likelihood Estimates Under Nonstandard Conditions.” In Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, edited by Le Cam, L. M. and Heyman, J., 221–223. Berkeley, CA: California University Press.
Ibragimov, R., and Muller, U. K. 2002. “t-Statistic Based Correlation and Heterogeneity Robust Inference.” Journal of Business and Economic Statistics 28(4):453–468.
Imbens, G. W., and Kolesár, M. 2016. “Robust Standard Errors in Small Samples: Some Practical Advice.” The Review of Economics and Statistics 98(4):701–712.
Jackson, J. E. 2019. “Replication Data for: Corrected Standard Errors with Clustered Data.” Harvard Dataverse, V1.
Liang, K.-Y., and Zeger, S. L. 1986. “Longitudinal Data Analysis for Generalized Linear Models.” Biometrika 73:13–22.
Long, J. S., and Ervin, L. H. 2000. “Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model.” The American Statistician 54(3):217–224.
MacKinnon, J. G., and Webb, M. D. 2017. “Wild Bootstrap Inference for Wildly Different Cluster Sizes.” Journal of Applied Econometrics 32(2):233–254.
Roodman, D., Nielsen, M. Ø., MacKinnon, J. G., and Webb, M. D. 2019. “Fast and Wild: Bootstrap Inference in Stata Using Boottest.” The Stata Journal 19(1):4–60.
Wasserstein, R. L., Schirm, A. L., and Lazar, N. A. 2019. “Moving to a World Beyond ‘$p<0.05$.’” The American Statistician 73(1):1–19.
White, H. 1980. “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity.” Econometrica 48:817–838.


Supplementary materials

Jackson supplementary material (215 KB)
