A Framework for the Unsupervised and Semi-Supervised Analysis of Visual Frames

Published online by Cambridge University Press: 23 October 2023

Michelle Torres*
Affiliation:
Assistant Professor, Department of Political Science, University of California, Los Angeles, Los Angeles, CA, USA.

Abstract

This article introduces to political science a framework for analyzing the content of visual material with unsupervised and semi-supervised methods. It details the implementation of a tool from the computer vision field, the Bag of Visual Words (BoVW), for defining and extracting “tokens” that allow researchers to build an Image-Visual Word Matrix, which emulates the Document-Term Matrix in text analysis. This reduction technique is the basis for several tools familiar to social scientists, such as topic models, that permit exploratory and semi-supervised analysis of images. Relative to deep learning techniques, the framework offers gains in transparency, interpretability, and the inclusion of domain knowledge. I illustrate the scope of the BoVW by estimating a novel visual structural topic model that focuses substantively on identifying visual frames in pictures of the migrant caravan from Central America.
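To make the analogy with the Document-Term Matrix concrete, the following is a minimal sketch of how an Image-Visual Word Matrix could be assembled. It is not the article's implementation. It assumes that visual words are cluster centers of local image descriptors, as is common in the BoVW literature, and it substitutes toy two-dimensional points for real descriptors. The function names (`build_vocabulary`, `image_visual_word_matrix`), the farthest-point initialization, and the plain k-means loop are illustrative choices only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (lowest index wins ties)."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def build_vocabulary(descriptors, k, iterations=20):
    """Cluster descriptors into k visual words with a plain k-means loop.

    Initialization is deterministic: start from the first descriptor, then
    repeatedly add the descriptor farthest from the centroids chosen so far.
    """
    centroids = [list(descriptors[0])]
    while len(centroids) < k:
        farthest = max(
            descriptors,
            key=lambda d: min(squared_distance(d, c) for c in centroids),
        )
        centroids.append(list(farthest))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(d, centroids)].append(d)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def image_visual_word_matrix(descriptors_per_image, k):
    """Return (vocabulary, matrix); matrix[i][j] counts word j in image i."""
    pooled = [d for image in descriptors_per_image for d in image]
    vocabulary = build_vocabulary(pooled, k)
    matrix = []
    for image in descriptors_per_image:
        counts = [0] * k
        for d in image:
            counts[nearest(d, vocabulary)] += 1
        matrix.append(counts)
    return vocabulary, matrix


# Toy stand-in for local descriptors: three "images", each a list of 2-D points.
toy_images = [
    [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1)],
    [(5.1, 5.0), (4.9, 5.0)],
    [(0.0, 0.0), (0.1, 0.1), (0.05, 0.0), (5.0, 5.0)],
]

vocabulary, ivwm = image_visual_word_matrix(toy_images, k=2)
for row in ivwm:
    print(row)  # rows: [2, 1], [0, 2], [3, 1]
```

Each row of the resulting matrix plays the role of a document's term counts, so it could in principle be passed to the same count-based tools used for text, such as a topic model. With real images, the toy points would be replaced by descriptors extracted around detected keypoints, and the vocabulary size `k` would be a substantive modeling choice.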

Type
Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Society for Political Methodology

Footnotes

Edited by: Daniel Hopkins

Supplementary material

Torres supplementary material (PDF, 12 MB)
Torres Dataset (link)