Book contents
- The Dangerous Art of Text Mining
- Copyright page
- Dedication
- Contents
- Preface
- Acknowledgments
- Introduction
- Part I Toward a Smarter Data Science
- Chapter 1 Why Textual Data from the Past Is Dangerous
- Chapter 2 From Fantasy to Engagement
- Chapter 3 Words Are Keys and Words Are Barriers
- Chapter 4 Critical Search: A Theory
- Chapter 5 To Predict or to Describe?
- Part II The Hidden Dimensions of Temporal Experience
- Part III Disciplinary Implications
- Appendix: Notes on Data, Code, Labor, Room for Error, and British History
- Index
Chapter 4 - Critical Search: A Theory
from Part I - Toward a Smarter Data Science
Published online by Cambridge University Press: 21 September 2023
Summary
This chapter explores critical thinking about data and algorithms and offers a formula called “critical search.” To develop a critical perspective on the past, the researcher must investigate the “fit” between data, algorithm, secondary sources, and analysis. By recursively iterating through each part of the research process, one develops not the one true portrait of the past, but a portrait of the past in its multiple dimensions. Rigorous methodology thus produces not certainty but layers of contingency. The chapter discusses exemplarity, cherry-picking, and other common forms of sloppy analysis, as well as the imperfect grounds upon which historical narratives are built, coded as they are with ethnocentrism and other biases. The bulk of the chapter lays out a threefold method for addressing these issues through critical thinking and an energetically iterative approach. The first step is seeding: asking a wide range of essential questions about the data and about the methodological approach to the data, then applying these parameters to set up the most robust experiment possible. The second is broad winnowing, the next stage in the experiment, in which the scholar pores over the returns of the query to sort signal from noise and sturdy from flimsy, gathering up the promising results and discarding the less clear or less relevant information. The third is guided reading, in which the researcher turns to textual sources that can bring new knowledge to the known archives.
- Type: Chapter
- Information: The Dangerous Art of Text Mining: A Methodology for Digital History, pp. 117-134
- Publisher: Cambridge University Press
- Print publication year: 2023