Evidence from corpus linguistics

William A. Kretzschmar, Jr

doi:10.1017/CBO9780511576782.007

5 - Evidence from corpus linguistics

Published online by Cambridge University Press: 03 July 2009

Affiliation: University of Georgia

Summary

The use of computers, through their storage capacity rather than through their processing ability, begins to address Saussure's statement that “language in its totality is unknowable.” At the beginning of the computer age, processing dominated computer applications because memory was quite limited, whether in RAM or on longer-term media like tape or disk. Now, however, mass storage is much more available, so that it is possible to create tremendous corpora of language data, whether as sound files or as text files. At this writing, the use of networked storage arrays allows linguists to build linguistic corpora reaching many terabytes in size, whereas the largest storage device available to the Linguistic Atlas Project when it began to be computerized in the early 1980s held ten megabytes: growth by a factor of a million over a quarter century. One million words of running text occupy about six or seven megabytes. That is the size of the 1961-vintage Brown Corpus of American English, of the parallel LOB Corpus of British English, and of each of their later replications from the 1990s, the Freiburg-Brown (Frown) Corpus and the Freiburg-LOB (FLOB) Corpus (all available in ICAME 1999). Such a file was massive storage in the 1960s and most of a computer's hard drive in the mid-1980s, but it is now easily manageable on even the smallest of storage devices. Dictionaries demand the use of large corpora, in which as many different words of the language as possible may be found.
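The storage figures in the summary can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes decimal units (one megabyte as 10**6 bytes), which the text does not specify.

```python
# Assumption: decimal units. The summary does not say whether a
# "megabyte" means 10**6 or 2**20 bytes.
MB = 10**6
TB = 10**12


def scaled_capacity_tb(start_mb, factor):
    """Capacity in terabytes after scaling a size given in megabytes."""
    return start_mb * MB * factor / TB


def bytes_per_word(corpus_mb, words):
    """Average bytes of storage per running word."""
    return corpus_mb * MB / words


if __name__ == "__main__":
    # Ten megabytes grown by a factor of a million.
    print(scaled_capacity_tb(10, 10**6), "TB")
    # A one-million-word corpus stored in six or seven megabytes.
    for size_mb in (6, 7):
        print(size_mb, "MB:", bytes_per_word(size_mb, 10**6), "bytes per word")
```

Under these assumptions, ten megabytes scaled a millionfold comes to ten terabytes, in line with "many terabytes," and a one-million-word corpus of six or seven megabytes works out to six or seven bytes per running word.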

Type: Chapter
Information: The Linguistics of Speech, pp. 146-173

DOI: https://doi.org/10.1017/CBO9780511576782.007

Publisher: Cambridge University Press

Print publication year: 2009
