Book contents
- The Dangerous Art of Text Mining
- Copyright page
- Dedication
- Contents
- Preface
- Acknowledgments
- Introduction
- Part I Toward a Smarter Data Science
- Part II The Hidden Dimensions of Temporal Experience
- Chapter 6 The Many Windows of the House of the Past
- Chapter 7 Of Memory
- Chapter 8 The Distinctiveness of Certain Eras
- Chapter 9 The Measure of Influence
- Chapter 10 The Fit of Algorithms to Temporal Experience
- Chapter 11 Whither Modernity
- Chapter 12 Attacks on Environmentalists in Congress
- Part III Disciplinary Implications
- Appendix: Notes on Data, Code, Labor, Room for Error, and British History
- Index
Chapter 8 - The Distinctiveness of Certain Eras
from Part II - The Hidden Dimensions of Temporal Experience
Published online by Cambridge University Press: 21 September 2023
Summary
This chapter explores how historians identify the characteristics distinctive to a given historical era and how quantitative historians investigate the uniqueness of particular periods. Traditionally, historians have reckoned these thematic changes through comprehensive reading about and around adjacent periods. The chapter introduces tf-idf (term frequency-inverse document frequency), an algorithm familiar from library science, and shows how it can be used to index the most distinctive qualities of temporal periods of different scales. In a case study on Hansard’s parliamentary debates, the algorithm is applied to timescales ranging from a single day to a two-decade period. The results show how text mining can reveal class differentials in access to power as they played out in parliamentary attention: over the entire century, the concerns of working-class people and colonized subjects received only a fraction of the time allotted to elite concerns. The chapter also performs this analysis on the names of geographies and ethnicities that formed part of the British Empire, demonstrating that greater attention was given to white subjects than to colonized people who were also persons of color.
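As a rough illustration of the idea described in the summary, the sketch below scores words by tf-idf, treating each time period as one "document". It is a minimal sketch under stated assumptions, not the chapter's own code: the function names `tfidf_by_period` and `top_terms`, the toy token lists, and the specific weighting (relative term frequency multiplied by the natural log of periods over periods-containing-the-word) are all assumptions chosen for clarity. Other tf-idf variants smooth or normalize differently.

```python
import math
from collections import Counter


def tfidf_by_period(period_tokens):
    """Score each word in each period by tf-idf.

    period_tokens: dict mapping a period label to a list of word tokens.
    Returns a dict mapping period label -> {word: score}.
    """
    n_periods = len(period_tokens)
    counts = {p: Counter(toks) for p, toks in period_tokens.items()}

    # Document frequency: in how many periods does each word occur?
    df = Counter()
    for c in counts.values():
        df.update(c.keys())

    scores = {}
    for period, c in counts.items():
        total = sum(c.values())
        scores[period] = {
            # tf = share of the period's tokens; idf = log(N / df)
            word: (n / total) * math.log(n_periods / df[word])
            for word, n in c.items()
        }
    return scores


def top_terms(scores, period, k=3):
    """Return the k highest-scoring words for one period (ties broken alphabetically)."""
    ranked = sorted(scores[period].items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


# Hypothetical toy data, invented for illustration only (not drawn from Hansard).
periods = {
    "1830s": "reform bill reform franchise the the".split(),
    "1880s": "ireland land league home the the".split(),
}
scores = tfidf_by_period(periods)
print(top_terms(scores, "1830s", k=1))
```

Because "the" occurs in every period, its inverse document frequency is log(1) = 0, so it drops out, while a word concentrated in one period rises to the top. The same function can be rerun with the tokens grouped by day, year, or decade to compare timescales, which is the kind of multi-scale comparison the summary describes.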
- Type: Chapter
- Information: The Dangerous Art of Text Mining: A Methodology for Digital History, pp. 229-271. Publisher: Cambridge University Press. Print publication year: 2023