Advancing research in second language writing through computational tools and machine learning techniques: A research agenda

  • Scott A. Crossley

Abstract

This paper provides an agenda for replication studies focusing on second language (L2) writing and the use of natural language processing (NLP) tools and machine learning algorithms. Specifically, it introduces a range of the available NLP tools and machine learning algorithms and demonstrates how these could be used to replicate seminal studies in L2 writing that concentrate on longitudinal writing development, predicting essay quality, examining differences between L1 and L2 writers, the effects of writing topics, and the effects of writing tasks. The paper concludes with implications for the recommended replication studies in the field of L2 writing and the advantages of using NLP tools and machine learning algorithms.

