27 - Data Science as a New Foundation for Insightful, Reproducible, and Trustworthy Social Science

from Part VI - Technology in Statistics and Research Methods

Published online by Cambridge University Press: 18 February 2019

Richard N. Landers
University of Minnesota
Publisher: Cambridge University Press
Print publication year: 2019

