3 - Corpora: capturing language in use

from Part I - Investigating variation in English: how do we know what we know?

Published online by Cambridge University Press:  03 May 2011

Alexandra D'Arcy (University of Victoria)
Warren Maguire (University of Edinburgh)
April McMahon (University of Edinburgh)

Summary

Introduction

Language cannot be invented; it can only be captured.

(Sinclair 1997: 31)

The enterprise of investigating language variation is based on access to empirical data – language as actually used by speakers and writers. This is not trivial. We only know what we do about variation in English (or, for that matter, in any variety, dialect, register, etc.) through analysis of language in some collection of materials. This collection, ‘the corpus’, is the foundation of everything we do. The data might consist of a collection of letters and diaries, spoken narratives of personal experience, or a compilation of text logs from instant messaging conversations. The materials that provide data for variation studies are diverse, but what unites them is their empirical validity as representations of language in use and, as a consequence, our dependence on them. The simple truth is that we cannot engage in the study of language variation without access to a corpus of data on which to test our hypotheses, base our analyses, and inform our theories. Yet this simple truth masks a number of not-so-simple issues. How are corpora constructed? If a corpus contains spoken language, what is the best way to represent the speech in written format? How are corpora accessed and mined? What methods achieve what results? How should the results be interpreted (i.e. what do they mean, and what do they tell us)? This chapter explores these kinds of questions, but it intentionally presents few solutions.

Type
Chapter
Information
Publisher: Cambridge University Press
Print publication year: 2011
