
Characterising postgraduate students’ corpus query and usage patterns for disciplinary data-driven learning

Published online by Cambridge University Press: 19 June 2019

Peter Crosthwaite
The University of Queensland, Australia
Lillian L.C. Wong
The University of Hong Kong, Hong Kong
Joyce Cheung
The Hong Kong Polytechnic University, Hong Kong


Data-driven learning (DDL; Johns, 1991), involving students’ hands-on use of corpora for self-guided language learning, is a methodology now increasingly used in many tertiary contexts to enhance the teaching of disciplinary postgraduate thesis writing. However, there are still few studies tracking students’ actual engagement with corpora for DDL. This mixed-methods study reports on the tracking of students’ corpus use via a purpose-built corpus query and data visualisation platform integrated into a large postgraduate disciplinary thesis writing program at a university in Hong Kong. Data on corpus usage history (e.g. times of access, duration of use), query syntax (e.g. query lexis/phraseology and use of wildcards and part-of-speech tags), query function (e.g. frequency lists/distribution, concordance sorting and collocation) and query filters (e.g. searches by faculty, discipline, or thesis section) were collected from 327 students, covering over 11,000 individual corpus queries. The results show significant interdisciplinary and inter-/intra-user trends and variation in the corpus functions and query syntax that users adopted. Students varied in the type of knowledge (e.g. domain-specific, language-specific) they were accessing, and frequently went beyond the exemplars of the DDL course materials to generate unique queries on their own initiative. Qualitative case study data from three corpus users’ activity logs also show distinctive individual corpus engagement by query frequency and function. These data provide a clearer insight into what students actually do during DDL and the different directions and trajectories that individual users take as a result of DDL. All accompanying DDL tasks are also included as supplementary materials.

Regular papers
© European Association for Computer Assisted Language Learning 2019 



Anthony, L. (2014) AntConc (Version 3.4.4). Tokyo: Waseda University.
Anthony, L. (2017) AntFileConverter (Version 1.2.1). Tokyo: Waseda University.
Baisa, V., and Suchomel, V. (2014) SkELL: Web interface for English language learning. In Horák, A. & Rychlý, P. (eds.), RASLAN 2014: Eighth Workshop on Recent Advances in Slavonic Natural Language Processing. Brno: NLP Consulting, 63–70.
Boulton, A. (2015) Applying data-driven learning to the web. In Leńko-Szymańska, A. & Boulton, A. (eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, 267–295.
Boulton, A., Carter-Thomas, S., and Rowley-Jolivet, E. (eds.) (2012) Corpus-informed research and learning in ESP: Issues and applications. Amsterdam: John Benjamins.
Boulton, A., and Cobb, T. (2017) Corpus use in language learning: A meta-analysis. Language Learning, 67(2): 348–393.
Centre for Applied English Studies (2017) Introduction to Thesis Writing. Hong Kong: The University of Hong Kong.
Chambers, A., and O’Sullivan, Í. (2004) Corpus consultation and advanced learners’ writing skills in French. ReCALL, 16(1): 158–172.
Charles, M. (2007) Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions. Journal of English for Academic Purposes, 6(4): 289–302.
Charles, M. (2014) Getting the corpus habit: EAP students’ long-term use of personal corpora. English for Specific Purposes, 35: 30–40.
Charles, M. (2015) Same task, different corpus: The role of personal corpora in EAP classes. In Leńko-Szymańska, A. & Boulton, A. (eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, 131–153.
Chen, M., and Flowerdew, J. (2018) Introducing data-driven learning to PhD students for research writing purposes: A territory-wide project in Hong Kong. English for Specific Purposes, 50: 97–112.
Cobb, T., and Boulton, A. (2015) Classroom applications of corpus analysis. In Biber, D. & Reppen, R. (eds.), The Cambridge handbook of English corpus linguistics. Cambridge: Cambridge University Press, 478–497.
Cotos, E. (2014) Enhancing writing pedagogy with learner corpus data. ReCALL, 26(2): 202–224.
Cotos, E., Link, S., and Huffman, S. (2017) Effects of DDL technology on genre learning. Language Learning & Technology, 21(3): 104–130.
Crosthwaite, P. (2017) Retesting the limits of data-driven learning: Feedback and error correction. Computer Assisted Language Learning, 30(6): 447–473.
Flowerdew, L. (2015) Data-driven learning and language learning theories: Whither the twain shall meet. In Leńko-Szymańska, A. & Boulton, A. (eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, 15–36.
Flowerdew, J. (2016) English for specific academic purposes (ESAP): Making the case. Writing & Pedagogy, 8(1): 5–32.
Frankenberg-Garcia, A. (2005) A peek into what today’s language learners as researchers actually do. International Journal of Lexicography, 18(3): 335–355.
Gaskell, D., and Cobb, T. (2004) Can learners use concordance feedback for writing errors? System, 32: 301–319.
Hafner, C. A., and Candlin, C. N. (2007) Corpus tools as an affordance to learning in professional legal education. Journal of English for Academic Purposes, 6: 303–318.
Hyland, K. (2000) Disciplinary discourses: Social interactions in academic writing. Harlow: Longman.
Johns, T. (1991) Should you be persuaded: Two examples of data-driven learning materials. In Johns, T. & King, P. (eds.), Classroom concordancing: English Language Research Journal 4. Birmingham: Centre for English Language Studies, University of Birmingham, 1–16.
Kilgarriff, A., and Grefenstette, G. (2003) Introduction to the special issue on the web as corpus. Computational Linguistics, 29(3): 333–347.
Kilgarriff, A., Rychly, P., Smrz, P., and Tugwell, D. (2004) The Sketch Engine. Information Technology, 105: 116–127.
Lee, D., and Swales, J. (2006) A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora. English for Specific Purposes, 25(1): 56–75.
Lee, H., Warschauer, M., and Lee, J. H. (2018) The effects of corpus use on second language vocabulary learning: A multilevel meta-analysis. Applied Linguistics. Advance online publication.
Leńko-Szymańska, A., and Boulton, A. (eds.) (2015) Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins.
Long, M. H. (1991) Focus on form: A design feature in language teaching methodology. In de Bot, K., Ginsberg, R. B. & Kramsch, C. (eds.), Foreign language research in cross-cultural perspective. Amsterdam: John Benjamins, 39–52.
Luo, Q. (2016) The effects of data-driven learning activities on EFL learners’ writing development. SpringerPlus, 5(1): 1255.
Millar, N. (2011) The processing of malformed formulaic language. Applied Linguistics, 32(2): 129–148.
Pérez-Paredes, P., Sánchez-Tornel, M., Alcaraz Calero, J. M., and Jiménez, P. A. (2011) Tracking learners’ actual uses of corpora: Guided vs non-guided corpus consultation. Computer Assisted Language Learning, 24(3): 233–253.
Schmidt, R. W. (1990) The role of consciousness in second language learning. Applied Linguistics, 11(2): 129–158.
Steel, C. (2012) Fitting learning into life: Language students’ perspectives on benefits of using mobile apps. In Brown, M., Hartnett, M. & Stewart, T. (eds.), Future challenges, sustainable futures. Proceedings ASCILITE. Wellington: Massey University, 875–880.
Steel, C. H., and Levy, M. (2013) Language students and their technologies: Charting the evolution 2006–2011. ReCALL, 25(3): 306–320.
Widmann, J., Koh, K., and Ziai, R. (2011) The SACODEYL search tool: Exploiting corpora for language learning purposes. In Frankenberg-Garcia, A., Flowerdew, L. & Aston, G. (eds.), New trends in corpora and language learning. London: Continuum, 167–178.
Yoon, H. (2008) More than a linguistic reference: The influence of corpus technology on L2 academic writing. Language Learning & Technology, 12(2): 31–48.
Yoon, H., and Hirvela, A. (2004) ESL student attitude toward corpus use in L2 writing. Journal of Second Language Writing, 13(4): 257–283.
Supplementary material: Crosthwaite et al. supplementary material 1 (File, 731 KB)