Support for the Evaluation Inference

doi:10.1017/9781108669849.008

5 - Support for the Evaluation Inference

Investigating Conditions for Rating Responses on a Test of Academic Oral Language

from Part II - Investigating Score Interpretations

Published online by Cambridge University Press: 14 January 2021

Hyejin Yang

Edited by Carol A. Chapelle and Erik Voss


Carol A. Chapelle, Iowa State University
Erik Voss, Teachers College, Columbia University


Summary

This chapter reports on one aspect of the argument-based validation research conducted to evaluate the interpretation and use of scores from the Oral English Certification Test (OECT). The OECT, a test of English speaking ability for prospective international teaching assistants (ITAs), was updated by introducing a new web-based rating system called Rater-Platform (R-PLAT). R-PLAT was intended to improve the efficiency of the rating process, but research was needed to investigate its effects on all aspects of the interpretation/use argument (Kane, 2013). The study investigated the warrant underlying the evaluation inference: that the observed performance on the OECT recorded via R-PLAT provides observed scores and observed performance descriptors reflective of targeted speaking ability. The assumption in need of support was that the quality of the rating conditions created by R-PLAT was sufficient for gathering accurate scores. Backing was found through analysis of raters’ perceptions of R-PLAT and their use of it, collected through questionnaires and interviews. The chapter concludes with the validity argument, showing how evidence collected in this study supported the assumptions underlying the evaluation inference. It also suggests future research needed to build the complete validity argument for the OECT with R-PLAT, and the potential use of a web-based rating system for other speaking tests.

Keywords

rater-platform (R-PLAT); rating process; evaluation inference; rater perceptions; international teaching assistants (ITAs); proficiency levels

Type: Chapter
Information: Validity Argument in Language Testing: Case Studies of Validation Research, pp. 96–119

DOI: https://doi.org/10.1017/9781108669849.008

Publisher: Cambridge University Press

Print publication year: 2021


References

Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 19(4), 453–476.

Bachman, L. F., & Palmer, A. (2010). Language assessment in practice. Oxford: Oxford University Press.

Canale, M. (1986). The promise and threat of computerized adaptive assessment of reading comprehension. In Stansfield, C. (Ed.), Technology and language testing (pp. 30–45). Washington, DC: TESOL.

Chapelle, C. A. (2012). Conceptions of validity. In Fulcher, G. & Davidson, F. (Eds.), The Routledge handbook of language testing (pp. 21–33). New York: Routledge.

Chapelle, C. A., Cotos, E., & Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 33(2), 385–405.

Chapelle, C. A., & Douglas, D. (2006). Assessing language through computer technology. Cambridge: Cambridge University Press.

Chapelle, C. A., Enright, M. K., & Jamieson, J. (Eds.). (2008). Building a validity argument for the Test of English as a Foreign Language™. New York: Routledge.

Chung, Y. (2014). A test of productive English grammatical ability in academic writing: Development and validation. Unpublished doctoral dissertation, Iowa State University, Ames, IA.

Cotos, E., & Chung, Y.-R. (2018). Domain description: Validating the interpretation of the TOEFL iBT® speaking scores for international teaching assistant screening and certification purposes. TOEFL Research Report No. RR-85. Princeton, NJ: Educational Testing Service. https://doi.org/10.1002/ets2.12233

Creswell, J. W., & Plano Clark, V. L. (2011). Designing and conducting mixed methods research (2nd ed.). Thousand Oaks, CA: Sage Publications.

Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24(1), 37–64.

Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory. New York: Aldine.

Jun, H. (2014). A validity argument for the use of scores from a web-search-permitted and web-source-based integrated writing test. Unpublished doctoral dissertation, Iowa State University, Ames, IA.

Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527–535.

Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319–342.

Kane, M. T. (2004). Certification testing as an illustration of argument-based validation. Measurement: Interdisciplinary Research and Perspectives, 2(3), 135–170.

Kane, M. T. (2006). Validation. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education.

Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.

Knoch, U., & Chapelle, C. A. (2018). Validation of rating processes within an argument-based framework. Language Testing, 35(4), 477–499.

McNamara, T. F. (1996). Measuring second language performance. London: Longman.

Ockey, G. J. (2009). The effects of a test taker’s group members’ personalities on the test taker’s second language group oral discussion test scores. Language Testing, 26(2), 161–186.

Yang, H. (2016). Integration of a web-based rating system with an oral proficiency interview test: Argument-based approach to validation. Unpublished doctoral dissertation, Iowa State University, Ames, IA.
