Estimating latent traits from expert surveys: an analysis of sensitivity to data-generating process

Published online by Cambridge University Press: 15 July 2021

Kyle L. Marquardt*
School of Politics and Governance and International Center for the Study of Institutions and Development, HSE University, Moscow, Russia
Daniel Pemstein
Political Science and Public Policy, North Dakota State University, Fargo, ND, USA
*Corresponding author. Email:


Models for converting expert-coded data to estimates of latent concepts assume different data-generating processes (DGPs). In this paper, we simulate ecologically valid data according to different assumptions, and examine the degree to which common methods for aggregating expert-coded data (1) recover true values and (2) construct appropriate coverage intervals. We find that the mean, hierarchical Aldrich–McKelvey (A–M) scaling, and hierarchical item-response theory (IRT) models perform similarly when expert error is low; the hierarchical latent variable models (A–M and IRT) outperform the mean when expert error is high. Hierarchical A–M and IRT models generally perform similarly, although IRT models are often more likely to include true values within their coverage intervals. The median and non-hierarchical latent variable models perform poorly under most assumed DGPs.
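
The evaluation logic described above (simulate ratings from a known truth, aggregate them, then check recovery and interval coverage) can be illustrated with a deliberately simplified sketch. It is not the simulation design used in the paper: the DGP here (each rating equals the true value plus normal noise with a common `error_sd`), the function names, and all parameter values are assumptions chosen for illustration, and the sketch covers only the mean and median, omitting the A–M and IRT models entirely.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_cases=200, n_experts=5, error_sd=0.5, seed=1):
    """Simulate expert ratings and score two simple aggregators.

    Assumed (hypothetical) DGP: rating = true value + Normal(0, error_sd).
    """
    rng = random.Random(seed)
    truth, means, medians, covered = [], [], [], 0
    for _ in range(n_cases):
        z = rng.gauss(0.0, 1.0)  # true latent value for this case
        ratings = [z + rng.gauss(0.0, error_sd) for _ in range(n_experts)]
        m = statistics.fmean(ratings)
        se = statistics.stdev(ratings) / n_experts ** 0.5
        # Naive normal-approximation 95% interval around the mean
        covered += (m - 1.96 * se) <= z <= (m + 1.96 * se)
        truth.append(z)
        means.append(m)
        medians.append(statistics.median(ratings))
    return {
        "mean_corr": pearson(truth, means),      # recovery of true values
        "median_corr": pearson(truth, medians),  # recovery of true values
        "mean_coverage": covered / n_cases,      # share of intervals containing truth
    }


low = simulate(error_sd=0.25)   # low expert error
high = simulate(error_sd=2.0)   # high expert error
print("low error:", low)
print("high error:", high)
```

Comparing the two runs shows how recovery by simple aggregators degrades as expert error grows, which is the kind of contrast the abstract reports. A fuller replication would add the latent variable models, for example fitted in Stan.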

Research Note
Copyright © The Author(s), 2021. Published by Cambridge University Press on behalf of the European Political Science Association

Supplementary material: Link

Marquardt and Pemstein Dataset

Supplementary material: PDF

Marquardt and Pemstein supplementary material
