Cluster–Robust Variance Estimation for Dyadic Data

  • Peter M. Aronow (a1), Cyrus Samii (a2) and Valentina A. Assenova (a3)


Dyadic data are common in the social sciences, although inference for such settings involves accounting for a complex clustering structure. Many analyses in the social sciences fail to account for the fact that multiple dyads share a member, and that errors are thus likely correlated across these dyads. We propose a non-parametric, sandwich-type robust variance estimator for linear regression to account for such clustering in dyadic data. We enumerate conditions for estimator consistency. We also extend our results to repeated and weighted observations, including directed dyads and longitudinal data, and provide an implementation for generalized linear models such as logistic regression. We examine empirical performance with simulations and an application to interstate disputes.
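
As a rough illustration of the kind of estimator the abstract describes, the following Python sketch computes OLS coefficients and a sandwich variance whose "meat" sums cross-products of score contributions over every pair of dyads that share at least one member. This specific form, the function name `dyadic_cluster_robust_vcov`, the `members` array layout, and the simulated data are assumptions made for illustration. The abstract itself does not spell out the formula, and the sketch omits finite-sample corrections, weights, and repeated observations.

```python
import itertools

import numpy as np


def dyadic_cluster_robust_vcov(X, y, members):
    """OLS fit plus a dyadic cluster-robust sandwich variance (illustrative).

    X       : (n, k) design matrix, one row per dyad.
    y       : (n,) outcome, one value per dyad.
    members : (n, 2) integer array with the two unit ids in each dyad
              (hypothetical layout chosen for this sketch).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    members = np.asarray(members)

    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    scores = X * resid[:, None]  # row d holds x_d * e_d

    # share[d, d2] is True when dyads d and d2 have at least one unit in common
    # (this includes d == d2, which gives the usual heteroskedasticity term).
    a, b = members[:, 0], members[:, 1]
    share = (
        (a[:, None] == a[None, :])
        | (a[:, None] == b[None, :])
        | (b[:, None] == a[None, :])
        | (b[:, None] == b[None, :])
    )

    # Meat: sum of outer products of scores over all member-sharing dyad pairs.
    meat = scores.T @ share.astype(float) @ scores
    return beta, bread @ meat @ bread


# Simulated example (assumed data-generating process, not from the article):
# unit-level effects induce error correlation across dyads sharing a member.
rng = np.random.default_rng(0)
n_units = 30
unit_effect = rng.normal(size=n_units)
pairs = np.array(list(itertools.combinations(range(n_units), 2)))
x = rng.normal(size=len(pairs))
err = unit_effect[pairs[:, 0]] + unit_effect[pairs[:, 1]] + rng.normal(size=len(pairs))
y = 1.0 + 0.5 * x + err
X = np.column_stack([np.ones(len(pairs)), x])

beta, vcov = dyadic_cluster_robust_vcov(X, y, pairs)
print("coefficients:", beta)
print("dyadic-robust standard errors:", np.sqrt(np.diag(vcov)))
```

When no two dyads share a member, the sharing matrix reduces to the identity and the sketch collapses to the familiar heteroskedasticity-robust (HC0) sandwich. The dense n-by-n sharing matrix keeps the code short but would need a sparser construction for large numbers of dyads.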



Authors' note: The authors thank Neal Beck, Allison Carnegie, Dean Eckles, Donald Lee, Winston Lin, Kelly Rader, Olav Sorenson, the Political Analysis editors, and two reviewers for helpful comments. They thank Jonathan Baron and Lauren Pinson for research assistance. Supplementary materials for this article are available on the Political Analysis Web site. Replication materials are available on the Political Analysis Dataverse (



Supplementary materials

Aronow et al. supplementary material (Supporting Information): PDF, 217 KB
