
2 - Understanding Argument-Based Validity in Language Testing

from Part I - Basic Concepts and Uses of Validity Argument in Language Testing and Assessment

Published online by Cambridge University Press: 14 January 2021

Carol A. Chapelle
Iowa State University
Erik Voss
Teachers College, Columbia University


Argument-based validity has evolved in response to language testing researchers' need for a systematic approach to investigating the validity of language tests. Based on a collection of 51 recent books, articles, and research reports in language assessment, this chapter describes the fundamental characteristics of an argument-based approach to validity, which has been operationalized in various ways in language assessment. These characteristics demonstrate how argument-based validity operationalizes the ideals for validation presented by Messick (1989) and accepted by most language testers: that a validity argument should be a unitary but multifaceted means of integrating a variety of evidence in an ongoing validation process. The chapter describes how validity arguments serve the multiple functions that language testers demand of their validation tools and takes into account the concepts that are important in language testing. It distinguishes between two formulations of argument-based validity that appear in language testing in order to introduce the conventions used throughout the papers in the volume.

Validity Argument in Language Testing: Case Studies of Validation Research, pp. 19–44
Publisher: Cambridge University Press
Print publication year: 2021



Aryadoust, V. (2011). Validity arguments of the speaking and listening modules of international English language testing system: A synthesis of existing research. Asian ESP Journal, 7(2), 28–54.
Aryadoust, V. (2013). Building a validity argument for a listening test of academic proficiency. Newcastle upon Tyne: Cambridge Scholars Publishing.
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 2(1), 1–34.
Bachman, L. F., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Bachman, L. F., & Palmer, A. (2010). Language assessment in practice. Oxford: Oxford University Press.
Barkaoui, K. (2017). Examining repeaters’ performance on second language proficiency tests: A review and a call for research. Language Assessment Quarterly, 14(4), 420–431.
Brooks, L., & Swain, M. (2014). Contextualizing performances: Comparing performances during TOEFL iBT™ and real-life academic speaking activities. Language Assessment Quarterly, 11(4), 353–373.
Carroll, P. E., & Bailey, A. L. (2016). Do decision rules matter? A descriptive study of English language proficiency assessment classifications for English-language learners and native English speakers in fifth grade. Language Testing, 33(1), 23–52.
Chapelle, C. A. (1998). Construct definition and validity inquiry in SLA research. In Bachman, L. F. & Cohen, A. D. (Eds.), Second language acquisition and language testing interfaces (pp. 32–70). Cambridge: Cambridge University Press.
Chapelle, C. A. (1999). Validity in language assessment. Annual Review of Applied Linguistics, 19, 254–272.
Chapelle, C. A. (2012). Validity argument for language assessment: The framework is simple… Language Testing, 29(1), 19–27.
Chapelle, C. A., Chung, Y.-R., Hegelheimer, V., Pendar, N., & Xu, J. (2010). Towards a computer-delivered test of productive grammatical ability. Language Testing, 27(4), 443–469.
Chapelle, C. A., Cotos, E., & Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 32(3), 385–405.
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (2008). Building a validity argument for the Test of English as a Foreign Language™. New York: Routledge.
Chapelle, C. A., & Voss, E. (2013). Evaluation of language tests through validation research. In Kunnan, A. J. (Ed.), The companion to language assessment (pp. 1079–1097). Chichester: Wiley.
Cheng, L., & Sun, Y. (2015). Interpreting the impact of the Ontario Secondary School Literacy Test on second language students within an argument-based validation framework. Language Assessment Quarterly, 12(1), 50–66.
Choi, Y. (2018). Graphic-prompt tasks for assessment of academic English writing ability: An argument-based approach to investigating validity. Unpublished doctoral dissertation, Iowa State University.
Chung, Y.-R. (2014). A test of productive English grammatical ability in academic writing: Development and validation. Unpublished doctoral dissertation, Iowa State University.
Colby-Kelly, C., & Turner, C. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review, 64(1), 9–37.
Creswell, J., & Plano Clark, V. (2017). Designing and conducting mixed methods research (3rd ed.). Thousand Oaks, CA: Sage Publications.
Cronbach, L. J. (1971). Test validation. In Thorndike, R. L. (Ed.), Educational measurement (pp. 443–507). Washington, DC: American Council on Education.
Cronbach, L. J. (1988). Internal consistency of tests: Analyses old and new. Psychometrika, 53(1), 63–70.
Doe, C. D. (2013). Validating the Canadian academic English language assessment for diagnostic purposes from three perspectives: Scoring, teaching, and learning. Unpublished doctoral dissertation, Queen’s University.
Doe, C. D. (2015). Student interpretations of diagnostic feedback. Language Assessment Quarterly, 12(1), 110–135.
Educational Testing Service (ETS). (2018). Validity evidence supporting the interpretation and use of TOEFL iBT® scores. TOEFL® Research Insight Series, Volume 4. Princeton, NJ: Educational Testing Service.
Enright, M. K., & Quinlan, T. (2010). Complementing human judgment of essays written by English language learners with e-rater® scoring. Language Testing, 27(3), 317–334.
Farnsworth, T. L. (2013). An investigation into the validity of the TOEFL iBT Speaking test for international teaching assistant certification. Language Assessment Quarterly, 10(3), 274–291.
Frost, K., Elder, C., & Wigglesworth, G. (2012). Investigating the validity of an integrated listening-speaking task: A discourse-based analysis of test takers’ oral performances. Language Testing, 29(3), 345–369.
Fulcher, G., & Davidson, F. (2009). Test architecture, test retrofit. Language Testing, 26(1), 123–144.
He, L., & Min, S. (2017). Development and validation of a computer adaptive EFL test. Language Assessment Quarterly, 14(2), 160–176.
Im, G.-H., & Cheng, L. (2019). The Test of English for International Communication (TOEIC®). Language Testing, 36(2), 315–324.
Jia, Y. (2013). Justifying the use of a second language oral test as an exit test in Hong Kong: An application of assessment use argument framework. Unpublished doctoral dissertation, University of California, Los Angeles.
Johnson, R. C. (2011). Assessing the assessments: Using an argument-based validity framework to assess the validity and use of an English placement system in a foreign language context. Unpublished doctoral dissertation, Macquarie University.
Jun, H. S. (2014). A validity argument for the use of scores from a web-search-permitted and web-source-based integrated writing test. Unpublished doctoral dissertation, Iowa State University.
Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527–535.
Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38, 319–342.
Kane, M. T. (2006). Validation. In Brennan, R. (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: Greenwood Publishing.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.
Kenyon, D. (2012). Using Bachman’s assessment use argument as a tool in conceptualizing the issues surrounding linking ACTFL and CEFR. In Tschirner, E. (Ed.), Aligning frameworks of reference in language testing: The ACTFL Proficiency Guidelines and the Common European Framework of Reference for Languages (pp. 23–34). Tübingen, Germany: Stauffenburg Verlag.
Kim, Y.-H. (2010). An argument-based validity inquiry into the Empirically-derived Descriptor-based Diagnostic (EDD) assessment in ESL academic writing. Unpublished doctoral dissertation, University of Toronto.
Klebanov, B. B., Ramineni, C., Kaufer, D., Yeoh, P., & Ishizaki, S. (2019). Advancing the validity argument for standardized writing tests using quantitative rhetorical analysis. Language Testing, 36(1), 125–144.
Koizumi, R., Sakai, H., Ido, T., Ota, H., Hayama, M., Sato, M., & Nemoto, A. (2011). Toward validity argument for test interpretation and use based on scores of a diagnostic grammar test for Japanese learners of English. Japanese Journal for Research on Testing (『日本テスト学会誌』), 7(1), 99–119.
LaFlair, G. T., & Staples, S. (2017). Using corpus linguistics to examine the extrapolation inference in the validity argument for a high-stakes speaking assessment. Language Testing, 34(4), 451–475.
Lee, J. (2016). Transfer from ESL academic writing to first year composition and other disciplinary courses: An assessment perspective. Unpublished doctoral dissertation, Iowa State University.
Li, Z. (2015). An argument-based validation study of the English Placement Test (EPT): Focusing on the inferences of extrapolation and ramification. Unpublished doctoral dissertation, Iowa State University.
Llosa, L. (2005). Building and supporting a validity argument for a standards-based classroom assessment of English proficiency. Unpublished doctoral dissertation, University of California, Los Angeles.
Llosa, L. (2008). Building and supporting a validity argument for a standards-based classroom assessment of English proficiency based on teacher judgments. Educational Measurement: Issues and Practice, 27(3), 32–42.
Llosa, L., & Malone, M. E. (2019). Comparability of students’ writing performance on TOEFL iBT and in required university writing courses. Language Testing, 36(2), 235–263.
McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Oxford: Blackwell Publishing.
Messick, S. (1989). Validity. In Linn, R. L. (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: Macmillan Publishing Co.
Mislevy, R. J., & Haertel, G. D. (2006). Implications of evidence-centered design for educational testing. Educational Measurement: Issues and Practice, Winter, 6–20.
Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–62.
Norris, J. M. (2008). Validity evaluation in language assessment. New York: Peter Lang.
Pan, M., & Qian, D. D. (2017). Embedding corpora into the content validation of the grammar test of the National Matriculation English Test (NMET) in China. Language Assessment Quarterly, 14(2), 120–139.
Papageorgiou, S., & Tannenbaum, R. J. (2016). Situating standard setting within argument-based validity. Language Assessment Quarterly, 13(2), 109–123.
Pardo-Ballester, C. (2010). The validity argument of a web-based Spanish listening exam: Test usefulness evaluation. Language Assessment Quarterly, 7(2), 137–159.
Park, M. (2015). Development and validation of virtual interactive tasks for an aviation English assessment. Unpublished doctoral dissertation, Iowa State University.
Plakans, L., & Burke, M. (2013). The decision-making process in language program placement: Test and nontest factors interacting in context. Language Assessment Quarterly, 10(2), 115–134.
Roever, C. (2011). Testing of second language pragmatics: Past and future. Language Testing, 28(4), 463–481.
Sawaki, Y., & Sinharay, S. (2018). Do the TOEFL iBT® section scores provide value-added information to stakeholders? Language Testing, 35(4), 529–556.
Schmidgall, J. E., Getman, E. P., & Zu, J. (2018). Screener tests need validation too: Weighing an argument for test use against practical concerns. Language Testing, 35(4), 583–607.
Schmidgall, J. E., & Xi, X. (2020). Validation of language assessments. In Chapelle, C. A. (Ed.), Concise encyclopedia of applied linguistics (pp. 1123–1135). Oxford: Wiley-Blackwell.
So, Y. (2014). Are teacher perspectives useful? Incorporating EFL teacher feedback in the development of a large-scale international English test. Language Assessment Quarterly, 11(3), 283–303.
Suzuki, Y. (2015). Self-assessment of Japanese as a second language: The role of experiences in the naturalistic acquisition. Language Testing, 32(1), 63–81.
Toulmin, S. E. (2003). The uses of argument. Cambridge: Cambridge University Press.
Vongpumivitch, V. (2010). The General English Proficiency Test. In Cheng, L. & Curtis, A. (Eds.), English language assessment and the Chinese learner (pp. 158–172). New York: Routledge.
Voss, E. (2012). A validity argument for score meaning of a computer-based ESL academic collocational ability test based on a corpus-driven approach to test design. Unpublished doctoral dissertation, Iowa State University.
Wang, H., Choi, I., Schmidgall, J., & Bachman, L. F. (2012). Review of Pearson Test of English Academic. Language Testing, 29(4), 603–619.
Weigle, S. C., Yang, W., & Montee, M. (2013). Exploring reading processes in an academic reading test using short-answer questions. Language Assessment Quarterly, 10(1), 28–48.
Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Basingstoke: Palgrave Macmillan.
Xi, X. (2008). Methods of test validation. In Shohamy, E. & Hornberger, N. H. (Eds.), Encyclopedia of language and education, 2nd edition, Volume 7: Language testing and assessment (pp. 177–196). New York: Springer.
Xi, X. (2010). How do we go about investigating test fairness? Language Testing, 27(2), 147–170.
Yang, H. (2016). Integration of a web-based rating system with an oral proficiency interview test: Argument-based approach to validation. Unpublished doctoral dissertation, Iowa State University.
Youn, S. J. (2015). Validity argument for assessing L2 pragmatics in interaction using mixed methods. Language Testing, 32(2), 199–225.