
7 - Comparing Logistic Regression, Multinomial Regression, Classification Trees and Random Forests Applied to Ternary Variables

Three-Way Genitive Variation in English

from Part III - Perspectives on Multifactorial Methods

Published online by Cambridge University Press: 06 May 2022

Ole Schützler
Universität Leipzig
Julia Schlüter
Universität Bamberg


The authors apply logistic regression, multinomial regression, classification trees and random forests to a ternary outcome variable: the variation between the ’s-genitive, the of-genitive and functionally equivalent noun + noun combinations. The statistical approaches discussed fall into regression models on the one hand and tree-based methods on the other. Specifically, as an alternative to successive binomial regression analyses, the authors implement a multinomial model, which can analyse the entire dataset with all three outcome categories simultaneously. Further, a basic classification tree is calculated alongside a more complex (and more robust) random forest. The chapter not only weighs the advantages and shortcomings of all four models but also explicates the different rationales and interpretations that come with them. A major insight is that the nature of the dataset, the analytic purpose and the statistical model are interdependent and condition each other in several non-trivial respects.
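To make concrete what it means for a multinomial model to handle three outcome categories simultaneously, the following minimal sketch computes outcome probabilities under a baseline-category logit model. It is an illustration only, not the authors' model or data: the coefficient values, the single numeric predictor `x`, and the choice of the of-genitive as baseline are all assumptions made for the example.

```python
import math

# Hypothetical (intercept, slope) pairs for the two non-baseline outcomes.
# These numbers are invented for illustration and are not from the chapter.
COEFS = {
    "s-genitive": (-0.5, 1.2),
    "noun+noun": (-1.0, 0.4),
}
# Assumed baseline category; its linear predictor is fixed at 0.
BASELINE = "of-genitive"


def multinomial_probs(x, coefs=COEFS, baseline=BASELINE):
    """Return outcome probabilities under a baseline-category logit model."""
    # One linear predictor per outcome; the baseline is 0 by construction.
    eta = {baseline: 0.0}
    for outcome, (intercept, slope) in coefs.items():
        eta[outcome] = intercept + slope * x
    # Exponentiate and normalise so the three probabilities sum to 1.
    denom = sum(math.exp(v) for v in eta.values())
    return {outcome: math.exp(v) / denom for outcome, v in eta.items()}


if __name__ == "__main__":
    for x in (0.0, 1.0):
        probs = multinomial_probs(x)
        print(x, {k: round(p, 3) for k, p in probs.items()})
```

Because all three probabilities share one normalising denominator, they sum to 1 for every value of the predictor. Separately fitted binomial models do not guarantee this coherence across the three variants.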

Data and Methods in Corpus Linguistics
Comparative Approaches
pp. 194–223
Publisher: Cambridge University Press
Print publication year: 2022



Further Reading

Agresti, Alan. 2013. Categorical Data Analysis. Hoboken, NJ: John Wiley & Sons, Inc.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning with Applications in R. New York: Springer.
Vanderschueren, Clara, and Ludovic De Cuypere. 2014. The Inflected/Non-Inflected Infinitive Alternation in Portuguese Adverbial Clauses. A Corpus Analysis. Language Sciences 41. 153–74.


