
4 - Comparing Baselines for Corpus Analysis

Research into the Get-Passive in Speech and Writing

from Part II - Selection, Calibration and Preparation of Corpus Data

Published online by Cambridge University Press:  06 May 2022

Ole Schützler
Affiliation:
Universität Leipzig
Julia Schlüter
Affiliation:
Universität Bamberg

Summary

The authors review different baselines for the study of alternant choices, emphasizing that normalization to a standard number of words – while straightforward in its application – will in many cases not provide a meaningful measure of frequency. Instead, it is argued, we need a baseline indicating opportunities of use, such as phrase or sentence counts. Exemplifying their proposal with reference to get- and be-passives and the presence or absence of agentive by-phrases, the authors demonstrate a sequence of measures taken to make the quantities that are compared more meaningful and defensible, based on linguistically informed selections of baseline quantities (number of main verbs, passives or potentially alternating passives). Crucially, this process must involve a categorization of observations by the researcher to ensure that mutual substitution is plausible in each case. To calibrate this manual data verification exercise to a manageable level, the authors apply a method of uneven category sub-sampling to the data, and use it to adjust variance estimates and confidence intervals in their analysis.
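The contrast between the two kinds of baseline can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the counts are invented, the function names are hypothetical, and the Wilson score interval is used only as one common choice for a single proportion; the summary does not say which interval the chapter uses, and the chapter's adjustment for uneven sub-sampling is not reproduced here.

```python
import math


def per_million_words(hits, corpus_words):
    """Frequency normalized to a standard number of words (per million)."""
    return hits / corpus_words * 1_000_000


def proportion(hits, opportunities):
    """Frequency relative to an opportunity-of-use baseline."""
    return hits / opportunities


def wilson_interval(p, n, z=1.959964):
    """Wilson score interval for an observed proportion p out of n cases.

    z defaults to the two-sided 95% critical value of the normal distribution.
    """
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Invented counts, for illustration only (not taken from the chapter).
samples = {
    "spoken": {"words": 600_000, "get_passives": 120, "alternating_passives": 2_400},
    "written": {"words": 400_000, "get_passives": 20, "alternating_passives": 4_000},
}

for name, s in samples.items():
    pmw = per_million_words(s["get_passives"], s["words"])
    p = proportion(s["get_passives"], s["alternating_passives"])
    low, high = wilson_interval(p, s["alternating_passives"])
    print(f"{name}: {pmw:.0f} per million words; "
          f"share of alternating passives = {p:.3f} [{low:.3f}, {high:.3f}]")
```

With these invented numbers the per-million-word rates differ by a factor of four, while the share of get-passives among potentially alternating passives differs by a factor of ten, so the choice of baseline changes the size of the contrast being reported.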

Type: Chapter
Information: Data and Methods in Corpus Linguistics: Comparative Approaches, pp. 101–126
Publisher: Cambridge University Press
Print publication year: 2022

Further Reading

Hundt, Marianne. 2009. How Often do Things Get V-ed in Philippine and Singapore English? A Case Study of the Get-Passive in Two Outer-Circle Varieties of English. In Bowen, Rhonwen, Mobarg, Mats and Ohlander, Solve, eds. Corpora and Discourse – and Stuff: Papers in Honor of Karin Aijmer. Gothenburg Studies in English 96. Gothenburg: University of Gothenburg. 121–31.
Mehl, Seth. 2018. What We Talk about When We Talk about Corpus Frequency: The Example of Polysemous Verbs with Light and Concrete Senses. Corpus Linguistics and Linguistic Theory. https://doi.org/10.1515/cllt-2017-0039.
Wallis, Sean. 2021. Statistics in Corpus Linguistics: A New Approach. New York: Routledge.

References

Ball, Catherine. 1994. Automated Text Analysis: Cautionary Tales. Literary and Linguistic Computing 9(4). 295–302.
Banks, David. 1994. Writ in Water: Aspects of the Scientific Journal Article. Brest: Erla, Université de Bretagne Occidentale.
Barber, Charles. 1962. Some Measurable Characteristics of Modern Scientific Prose. In Behre, Frank, ed. Contributions to English Syntax and Philology. Stockholm: Almqvist and Wiksell. 21–43.
Biber, Douglas, Finegan, Edward, Johansson, Stig, Conrad, Susan and Leech, Geoffrey. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Downing, Angela. 1996. The Semantics of Get-Passives. In Hasan, Ruqaiya, Cloran, Carmel and Butt, David G., eds. Functional Descriptions. Amsterdam: John Benjamins. 179–207.
Evison, Jane. 2010. What Are the Basics of Analysing a Corpus? In O’Keeffe, Anne and McCarthy, Michael, eds. The Routledge Handbook of Corpus Linguistics. London: Routledge. 122–35.
Fleisher, Nicholas. 2006. The Origin of Passive Get. English Language and Linguistics 10(2). 225–52.
Greenbaum, Sidney. 1996. Introducing ICE. In Greenbaum, Sidney, ed. Comparing English Worldwide: The International Corpus of English. Oxford: Clarendon Press. 3–12.
Hatcher, Anna Granville. 1949. To Get/Be Invited. Modern Language Notes 64(7). 433–46.
Huddleston, Rodney, and Pullum, Geoffrey K. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Huddleston, Rodney, and Pullum, Geoffrey K. 2005. A Student’s Introduction to English Grammar. Cambridge: Cambridge University Press.
Hundt, Marianne. 2009. How Often do Things Get V-ed in Philippine and Singapore English? A Case Study of the Get-Passive in Two Outer-Circle Varieties of English. In Bowen, Rhonwen, Mobarg, Mats and Ohlander, Solve, eds. Corpora and Discourse – and Stuff: Papers in Honor of Karin Aijmer. Gothenburg Studies in English 96. Gothenburg: University of Gothenburg. 121–31.
Jespersen, Otto. 1949. A Modern English Grammar on Historical Principles. Part 7. Copenhagen: E. Munksgaard.
Lavandera, Beatriz. 1978. Where Does the Sociolinguistic Variable Stop? Language in Society 7. 171–83.
Lakoff, Robin. 1971. Passive Resistance. Papers from the Regional Meeting of the Chicago Linguistic Society 7. 149–62.
Lindquist, Hans. 2009. Corpus Linguistics and the Description of English. Edinburgh: Edinburgh University Press.
McEnery, Tony, and Wilson, Andrew. 2001. Corpus Linguistics. 2nd ed. Edinburgh: Edinburgh University Press.
McEnery, Tony, Xiao, Richard and Tono, Yukio. 2006. Corpus-Based Language Studies: An Advanced Resource Book. New York: Routledge.
Mehl, Seth. 2018. What We Talk about When We Talk about Corpus Frequency: The Example of Polysemous Verbs with Light and Concrete Senses. Corpus Linguistics and Linguistic Theory. https://doi.org/10.1515/cllt-2017-0039.
Mehl, Seth. 2019. Mapping Lexical Co-occurrence Statistics against a Part of Speech Baseline. In Parviainen, Hannah, Kaunisto, Mark and Pahta, Päivi, eds. Corpus Approaches into World Englishes and Language Contrasts. Helsinki: eVarieng. https://varieng.helsinki.fi/series/volumes/20/mehl/ (accessed 27 March 2021).
Nelson, Gerald, Aarts, Bas and Wallis, Sean. 2002. Exploring Natural Language: Working with the British Component of the International Corpus of English. Amsterdam: John Benjamins.
Newcombe, Robert. 1998. Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods. Statistics in Medicine 17. 857–72.
Schegloff, Emanuel A. 1993. Reflections on Quantification in the Study of Conversation. Research on Language and Social Interaction 26(1). 99–128.
Smith, Nicholas, and Leech, Geoffrey. 2013. Verb Structures in Twentieth-Century British English. In Aarts, Bas, Close, Joanne, Leech, Geoffrey and Wallis, Sean, eds. The Verb Phrase in English: Investigating Recent Language Change with Corpora. Cambridge: Cambridge University Press. 68–98.
Toyota, Junichi. 2008. Diachronic Change in the English Passive. Basingstoke: Palgrave Macmillan.
Wallis, Sean. 2012a. That Vexed Problem of Choice. London: UCL Survey of English Usage. www.ucl.ac.uk/english-usage/statspapers/vexedchoice.pdf (accessed 27 March 2021).
Wallis, Sean. 2012b. Freedom to Vary and Significance Tests. London: UCL Survey of English Usage. http://corplingstats.wordpress.com/2012/09/30/free-to-vary (accessed 27 March 2021).
Wallis, Sean. 2013. Binomial Confidence Intervals and Contingency Tests. Journal of Quantitative Linguistics 20(3). 178–208.
Wallis, Sean. 2019. Comparing χ² Tables for Separability of Distribution and Effect. Meta-Tests for Comparing Homogeneity and Goodness of Fit Contingency Test Outcomes. Journal of Quantitative Linguistics 26(4). 330–55.
Wallis, Sean. 2021. Statistics in Corpus Linguistics: A New Approach. New York: Routledge.
Wilson, Edwin Bidwell. 1927. Probable Inference, the Law of Succession, and Statistical Inference. Journal of the American Statistical Association 22(158). 209–12.
