Appendix A - List of Example Stand-alone Corpus Description Articles

Published online by Cambridge University Press:  07 April 2022

Jesse Egbert
Northern Arizona University
Douglas Biber
Northern Arizona University
Bethany Gray
Iowa State University

Designing and Evaluating Language Corpora: A Practical Framework for Corpus Representativeness, pp. 224–225
Publisher: Cambridge University Press
Print publication year: 2022

Alsop, S., & Nesi, H. 2009. Issues in the development of the British Academic Written English (BAWE) corpus. Corpora 4(1): 71–83.
Baturo, A., Dasandi, N., & Mikhaylov, S. 2017. Understanding state preferences with text as data: Introducing the UN General Debate corpus. Research & Politics 4(2): 1–9.
Bednarek, M. 2020. The Sydney Corpus of Television Dialogue: Designing and building a corpus of dialogue from US TV series. Corpora 15(1): 107–19.
Biber, D., Egbert, J., & Davies, M. 2015. Exploring the composition of the searchable web: a corpus-based taxonomy of web registers. Corpora 10(1): 11–45.
Biber, D., Finegan, E., & Atkinson, D. 1994. ARCHER and its challenges: Compiling and exploring a representative corpus of historical English registers. In Fries, U., Tottie, G., & Schneider, P. (eds.), Creating and Using English Language Corpora (1–14). Rodopi.
Davies, M. 2009. The 385+ million word Corpus of Contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics 14(2): 159–90.
Davies, M. 2012. Expanding horizons in historical linguistics with the 400-million word Corpus of Historical American English. Corpora 7(2): 121–57.
Davies, M. 2014. Creating and using the Corpus do Portugues and the Frequency Dictionary of Portuguese. In Berber Sardinha, T. & Ferreira, T. (eds.), Working with Portuguese Corpora (89–110). Continuum.
Davies, M. 2021. The TV and Movies corpora: Design, construction, and use. International Journal of Corpus Linguistics 26(1): 10–37.
Davies, M., & Fuchs, R. 2015. Expanding horizons in the study of world Englishes with the 1.9 billion word Global Web-Based English Corpus (GloWbE). English World-Wide 36: 1–28.
Diemer, S., Brunner, M.-L., & Schmidt, S. 2016. Compiling computer-mediated spoken language corpora: Key issues and recommendations. International Journal of Corpus Linguistics 21(3): 348–71.
Friginal, E. 2009. Chapter 3. Corpora and description of speaker groups in the call center corpus. In The Language of Outsourced Call Centers: A Corpus-Based Study of Cross-cultural Interaction (39–73). John Benjamins.
Granger, S. 2003. The International Corpus of Learner English: A new resource for foreign language learning and teaching and second language acquisition research. TESOL Quarterly 37(3): 538–46.
Grieve, J. 2016. Chapter 2. Corpus. In Regional Variation in Written American English (16–35). Cambridge University Press.
Knight, D., Adolphs, S., & Carter, R. 2014. CANELC: Constructing an e-language corpus. Corpora 9(1): 29–56.
Krummes, C., & Ensslin, A. 2014. What’s hard in German? WHiG: A British learner corpus of German. Corpora 9(2): 191–205.
Llanos, L. C. 2014. A Spanish learner oral corpus for computer-aided error analysis. Corpora 9(2): 207–38.
Love, R., Dembry, C., Hardie, A., Brezina, V., & McEnery, T. 2017. The spoken BNC2014: Designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics 22(3): 319–44.
Nesi, H., Sharpling, G., & Ganobcsik-Williams, L. 2004. Student papers across the curriculum: Designing and developing a corpus of British student writing. Computers and Composition 21(4): 401–503.
O’Donnell, M., & Römer, U. 2012. From student hard drive to web corpus (part 2): The annotation and online distributions of the Michigan Corpus of Upper-Level Student Papers. Corpora 7(1): 1–18.
Ohashi, Y., Katagiri, N., Oka, K., & Hanada, M. 2020. ESP corpus design: Compilation of the Veterinary Nursing Medical Chart Corpus and the Veterinary Nursing Wordlist. Corpora 15(2): 125–40.
Pickering, L., Di Ferrante, L., Bruce, C., Friginal, E., Pearson, P., & Bouchard, J. 2019. An introduction to the ANAWC: The AAC and Non-AAC Workplace Corpus. International Journal of Corpus Linguistics 24(2): 229–244.
Puga, K., & Götz, S. 2017. “Keep out of reach of children!” Introducing the Corpus of Product Information (CoPI) and its potential for corpus-based genre teaching. Corpora 12(3): 393–423.
Römer, U., & O’Donnell, M. 2011. From student hard drive to web corpus (part 1): The design, compilation and genre classification of the Michigan Corpus of Upper-Level Student Papers (MICUSP). Corpora 6(2): 159–77.
Rühlemann, C., & O’Donnell, M. 2012. Introducing a corpus of conversational stories. Construction and annotation of the Narrative Corpus. Corpus Linguistics and Linguistic Theory 8(2): 313–50.
Staples, S. 2015. Chapter 3. Corpus and data analysis. In The Discourse of Nurse-Patient Interactions: Contrasting the Communicative Styles of U.S. and International Nurses. John Benjamins.
Tellings, A., Oostdijk, N., Monster, I., Grootjen, F., & van den Bosch, A. 2018. BasiScript: A corpus of contemporary Dutch texts written by primary school children. International Journal of Corpus Linguistics 23(4): 494–508.
Walkden, G. 2016. The HeliPaD: A parsed corpus of Old Saxon. International Journal of Corpus Linguistics 21(4): 559–71.
