Developing Standards for Post-Hoc Weighting in Population-Based Survey Experiments

Published online by Cambridge University Press:  12 October 2017

Annie Franco
Department of Political Science, Stanford University, Stanford, CA, USA
Neil Malhotra
Graduate School of Business, Stanford University, Stanford, CA, USA
Gabor Simonovits
Department of Politics, New York University, New York, NY, USA
L. J. Zigerell
Department of Politics and Government, Illinois State University, Normal, IL, USA


Weighting techniques are employed to generalize results from survey experiments to populations of theoretical and substantive interest. Although weighting is often viewed as a second-order methodological issue, these adjustment methods invoke untestable assumptions about the nature of sample selection and potential heterogeneity in the treatment effect. Therefore, although weighting is a useful technique in estimating population quantities, it can introduce bias and also be used as a researcher degree of freedom. We review survey experiments published in three major journals from 2000 to 2015 and find that there are no standard operating procedures for weighting survey experiments. We argue that all survey experiments should report the sample average treatment effect (SATE). Researchers seeking to generalize to a broader population can weight to estimate the population average treatment effect (PATE), but should discuss the construction and application of weights in a detailed and transparent manner given the possibility that weighting can introduce bias.
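As an illustration of the distinction between the two quantities, the minimal Python sketch below contrasts an unweighted difference in means (an estimate of the SATE) with a weighted difference in means (one simple way of targeting the PATE). The data, the function names, and the weights are hypothetical and are not taken from the article. The weighted estimator shown is only one of several possible approaches, and it recovers the PATE only under the untestable assumptions about sample selection and effect heterogeneity noted above.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted mean of `values`; weights must sum to a positive number."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def weighted_difference_in_means(
    outcomes: Sequence[float],
    treated: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Weighted treated mean minus weighted control mean."""
    t_y = [y for y, t in zip(outcomes, treated) if t == 1]
    t_w = [w for w, t in zip(weights, treated) if t == 1]
    c_y = [y for y, t in zip(outcomes, treated) if t == 0]
    c_w = [w for w, t in zip(weights, treated) if t == 0]
    return weighted_mean(t_y, t_w) - weighted_mean(c_y, c_w)


def sate_estimate(outcomes: Sequence[float], treated: Sequence[int]) -> float:
    """Unweighted difference in means: every respondent counts equally."""
    equal_weights = [1.0] * len(outcomes)
    return weighted_difference_in_means(outcomes, treated, equal_weights)


# Hypothetical toy data: four respondents, two per experimental condition.
outcomes = [1.0, 3.0, 2.0, 6.0]
treated = [0, 1, 0, 1]
# Hypothetical post-hoc weights: the last two respondents belong to a group
# that is under-represented in the sample relative to the target population.
weights = [1.0, 1.0, 3.0, 3.0]

sate = sate_estimate(outcomes, treated)
weighted = weighted_difference_in_means(outcomes, treated, weights)

# Reporting both makes the consequence of the weighting decision visible.
print(f"Unweighted (SATE) estimate: {sate}")
print(f"Weighted estimate:          {weighted}")
```

In this toy example the two estimates differ because the treatment effect is larger in the up-weighted group, which is the kind of heterogeneity that makes the choice to weight consequential and worth reporting transparently.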

Research Article
Copyright © The Experimental Research Section of the American Political Science Association 2017 
