
How to Get Better Survey Data More Efficiently

Published online by Cambridge University Press:  11 November 2020

Mollie J. Cohen
Assistant Professor, University of Georgia, Athens, GA 30602, USA
Zach Warner*
Postdoctoral Research Fellow, Cardiff University, Cardiff CF10 3AT, UK
Corresponding author: Zach Warner


A key challenge facing many large, in-person public opinion surveys is ensuring that enumerators follow fieldwork protocols. Implementing “quality control” processes can improve data quality and help ensure the representativeness of the final sample. Yet while public opinion researchers have demonstrated the utility of quality control procedures such as audio capture and geo-tracking, there is little research assessing the relative merits of such tools. In this paper, we present new evidence on this question using data from the 2016/17 wave of the AmericasBarometer study. Results from a large classification task demonstrate that a small set of automated and human-coded variables, available across popular survey platforms, can recover the final sample of interviews that results when a full suite of quality control procedures is implemented. Taken as a whole, our results indicate that implementing and automating just a few of the many quality control procedures available can streamline survey researchers’ quality control processes while substantially improving the quality of their data.
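The classification task described above can be illustrated with a minimal sketch. This is not the authors' code, and the feature names (interview duration, a GPS check, a human-coded audio review) are hypothetical stand-ins for the kinds of automated and human-coded quality control variables the abstract refers to; a flag-counting rule stands in for the classifier.

```python
# Hypothetical sketch: deciding whether an interview would survive quality
# control, using a small set of illustrative quality-control indicators.
# Feature names and thresholds are invented for illustration only.

def classify_interview(features):
    """Return "discard" if the interview fails two or more checks, else "keep".

    `features` is a dict of hypothetical quality-control indicators.
    """
    flags = 0
    if features["duration_minutes"] < 10:   # implausibly fast interview
        flags += 1
    if not features["gps_within_cluster"]:  # interview conducted off-route
        flags += 1
    if not features["audio_check_passed"]:  # human-coded audio review failed
        flags += 1
    return "discard" if flags >= 2 else "keep"

sample = [
    {"duration_minutes": 35, "gps_within_cluster": True,  "audio_check_passed": True},
    {"duration_minutes": 6,  "gps_within_cluster": False, "audio_check_passed": True},
]
decisions = [classify_interview(iv) for iv in sample]
# decisions == ["keep", "discard"]
```

In practice the paper evaluates statistical classifiers trained on such variables rather than a fixed rule, but the input-output structure (a few per-interview indicators mapped to a keep/discard decision) is the same.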

© The Author(s) 2020. Published by Cambridge University Press on behalf of the Society for Political Methodology



Edited by Jeff Gill


Supplementary material

Cohen and Warner supplementary material 1 (PDF, 531.2 KB)
Cohen and Warner supplementary material 2 (File, 43 KB)
Cohen and Warner Dataset (link)