Political Analysis: Volume 21 - Issue 3

Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts
Part of:
- PA editors' choice articles
Justin Grimmer, Brandon M. Stewart
Published online by Cambridge University Press: 04 January 2017, pp. 267-297
- Article
Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have hindered their use in political science research. Here lies the promise of automated text analysis: it substantially reduces the costs of analyzing large collections of text. We provide a guide to this exciting new area of research and show how, in many instances, the methods have already obtained part of their promise. But there are pitfalls to using automated methods—they are no substitute for careful thought and close reading and require extensive and problem-specific validation. We survey a wide range of new methods, provide guidance on how to validate the output of the models, and clarify misconceptions and errors in the literature. To conclude, we argue that for automated text methods to become a standard tool for political scientists, methodologists must contribute new methods and new methods of validation.

Validating Estimates of Latent Traits from Textual Data Using Human Judgment as a Benchmark
Will Lowe, Kenneth Benoit
Published online by Cambridge University Press: 04 January 2017, pp. 298-313
- Article
Automated and statistical methods for estimating latent political traits and classes from textual data hold great promise, because virtually every political act involves the production of text. Statistical models of natural language features, however, are heavily laden with unrealistic assumptions about the process that generates these data, including the stochastic process of text generation, the functional link between political variables and observed text, and the nature of the variables (and dimensions) on which observed text should be conditioned. While acknowledging statistical models of latent traits to be “wrong,” political scientists nonetheless treat their results as sufficiently valid to be useful. In this article, we address the issue of substantive validity in the face of potential model failure, in the context of unsupervised scaling methods of latent traits. We critically examine one popular parametric measurement model of latent traits for text and then compare its results to systematic human judgments of the texts as a benchmark for validity.

Modeling Dynamic Preferences: A Bayesian Robust Dynamic Latent Ordered Probit Model
Daniel Stegmueller
Published online by Cambridge University Press: 04 January 2017, pp. 314-333
- Article
Much politico-economic research on individuals' preferences is cross-sectional and does not model dynamic aspects of preference or attitude formation. I present a Bayesian dynamic panel model, which facilitates the analysis of repeated preferences using individual-level panel data. My model deals with three problems. First, I explicitly include feedback from previous preferences taking into account that available survey measures of preferences are categorical. Second, I model individuals' initial conditions when entering the panel as resulting from observed and unobserved individual attributes. Third, I capture unobserved individual preference heterogeneity both via standard parametric random effects and a robust alternative based on Bayesian nonparametric density estimation. I use this model to analyze the impact of income and wealth on preferences for government intervention using the British Household Panel Study from 1991 to 2007.

The Democracy Cluster Classification Index
Mihaiela Ristei Gugiu, Miguel Centellas
Published online by Cambridge University Press: 04 January 2017, pp. 334-349
- Article
Utilizing hierarchical cluster analysis, a new measure of democracy, the DCC index, is proposed and constructed from five popular indices of democracy (Freedom House, Polity IV, Vanhanen's index of democratization, Cheibub et al.'s index of democracy and dictatorship, and the Cingranelli-Richards index of electoral self-determination). The DCC was used to classify the regime types for twenty-four countries in the Americas and thirty-nine countries in Europe over a thirty-year period. The results indicated that democracy is a latent class variable. Sensitivity and specificity analyses were conducted for the five existing democracy indices as well as the newly proposed Unified Democracy Scores index and a predicted DCC score. This analysis revealed significant problems with existing measures. Overall, the predicted DCC index attained the highest level of accuracy although one other index achieved high levels of accuracy in identifying nondemocracies.

Genes and Politics: A New Explanation and Evaluation of Twin Study Results and Association Studies in Political Science
Doron Shultziner
Published online by Cambridge University Press: 04 January 2017, pp. 350-367
- Article
This article offers a new explanation for the results of twin studies in political science that supposedly disclose a genetic basis for political traits. I argue that identical twins tend to be more alike than nonidentical twins because the former are more similarly affected by the same environmental conditions, but the content of those greater trait similarities is nevertheless completely malleable and determined by particular environments. The twin studies method thus can neither prove nor refute the argument for a genetic basis of political traits such as liberal and conservative preferences or voting turnout. The meaning of heritability estimates in twin studies is discussed, as well as the definition and function of the environment in the political science twin studies. The premature attempts to associate political traits with specific genes despite countertrends in genetics are also examined. I conclude by proposing that the alternative explanation of this article may explain certain puzzles in behavioral genetics, particularly why social and political traits have higher heritability estimates than common physical and medical traits. I map the main points of disagreement with the methodology and the interpretation of its results, and delineate the main operative implications for future research.

Gene-Environment Interplay in Twin Models
Brad Verhulst, Peter K. Hatemi
Published online by Cambridge University Press: 04 January 2017, pp. 368-389
- Article
In this article, we respond to Shultziner's critique that argues that identical twins are more alike not because of genetic similarity, but because they select into more similar environments and respond to stimuli in comparable ways, and that these effects bias twin model estimates to such an extent that they are invalid. The essay further argues that the theory and methods that undergird twin models, as well as the empirical studies which rely upon them, are unaware of these potential biases. We correct this and other misunderstandings in the essay and find that gene-environment (GE) interplay is a well-articulated concept in behavior genetics and political science, operationalized as gene-environment correlation and gene-environment interaction. Both are incorporated into interpretations of the classical twin design (CTD) and estimated in numerous empirical studies through extensions of the CTD. We then conduct simulations to quantify the influence of GE interplay on estimates from the CTD. Due to the criticism's mischaracterization of the CTD and GE interplay, combined with the absence of any empirical evidence to counter what is presented in the extant literature and this article, we conclude that the critique does not enhance our understanding of the processes that drive political traits, genetic or otherwise.

Fatal Flaws in the Twin Study Paradigm: A Reply to Hatemi and Verhulst
Doron Shultziner
Published online by Cambridge University Press: 04 January 2017, pp. 390-392
- Article

PAN volume 21 issue 3 Cover and Front matter
Published online by Cambridge University Press: 04 January 2017, pp. f1-f4
- Article

Volume 21 - Issue 3 - Summer 2013

Research Article

Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts

Validating Estimates of Latent Traits from Textual Data Using Human Judgment as a Benchmark

Modeling Dynamic Preferences: A Bayesian Robust Dynamic Latent Ordered Probit Model

The Democracy Cluster Classification Index

Genes and Politics: A New Explanation and Evaluation of Twin Study Results and Association Studies in Political Science

Gene-Environment Interplay in Twin Models

Reply

Fatal Flaws in the Twin Study Paradigm: A Reply to Hatemi and Verhulst

Front matter

PAN volume 21 issue 3 Cover and Front matter
