Three Psychometric-Model-Based Option-Scored Multiple Choice Item Design Principles that Enhance Instruction by Improving Quiz Diagnostic Classification of Knowledge Attributes

Published online by Cambridge University Press:  01 January 2025

William Stout*
Affiliation:
University of Illinois at Urbana-Champaign (Statistics: Emeritus); University of Illinois Chicago (Learning Sciences Research Institute: Emeritus)
Robert Henson
Affiliation:
University of North Carolina Greensboro (Education)
Lou DiBello
Affiliation:
University of Illinois Chicago (Learning Sciences Research Institute: Emeritus)
* Correspondence should be made to William Stout, University of Illinois at Urbana-Champaign (Statistics: Emeritus), Champaign, USA. Email: w-stout1@illinois.edu

Abstract

This paper states three multiple choice (MC) item design principles, grounded in IRT-based diagnostic classification modeling (DCM), that improve the diagnostic classification of students from classroom quizzes. Using proven-optimal maximum likelihood-based student classification, example items demonstrate that adhering to these principles increases correct classification rates (CCRs) for attributes (skills and, especially, misconceptions). Simple formulas compute the needed item CCRs. The hope is that, with these psychometrically driven design principles, necessarily short MC-item quizzes can accurately diagnose enough attributes to be widely useful for instruction. These results should in turn stimulate greater use of well-designed MC item quizzes that target accurate diagnosis of skills and misconceptions, thereby enhancing classroom learning.
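As a rough illustration of what an item-level CCR under maximum likelihood classification means, the following Python sketch works through a single option-scored item. It is not the paper's own formula. The three latent classes, the option probabilities, and the function names are all hypothetical, and the calculation assumes equally likely classes and classification from one item only.

```python
from typing import Dict, List

# Hypothetical setup: rows are latent classes, columns are the options
# (A, B, C) of one MC item; each row holds P(option | class) and sums to 1.
OptionProbs = Dict[str, List[float]]


def ml_classify(option_probs: OptionProbs) -> List[str]:
    """For each option, return the class with the largest likelihood of
    choosing it (ties go to the class listed first)."""
    classes = list(option_probs)
    n_options = len(next(iter(option_probs.values())))
    return [
        max(classes, key=lambda c: option_probs[c][k])
        for k in range(n_options)
    ]


def class_ccrs(option_probs: OptionProbs) -> Dict[str, float]:
    """CCR per class: the probability that a student in that class picks
    an option that the ML rule maps back to the same class."""
    decision = ml_classify(option_probs)
    return {
        c: sum(p for p, d in zip(probs, decision) if d == c)
        for c, probs in option_probs.items()
    }


def overall_ccr(option_probs: OptionProbs) -> float:
    """Average of the class CCRs, assuming equally likely classes."""
    ccrs = class_ccrs(option_probs)
    return sum(ccrs.values()) / len(ccrs)


# Item whose distractor B is tied to a specific misconception (assumed values).
linked_item = {
    "skill":         [0.8, 0.1, 0.1],
    "misconception": [0.1, 0.8, 0.1],
    "neither":       [0.3, 0.3, 0.4],
}

# Item whose distractors do not separate the two non-skill classes.
unlinked_item = {
    "skill":         [0.8, 0.1, 0.1],
    "misconception": [0.2, 0.4, 0.4],
    "neither":       [0.2, 0.4, 0.4],
}

print(f"{overall_ccr(linked_item):.3f}")    # 0.667
print(f"{overall_ccr(unlinked_item):.3f}")  # 0.533
```

In this toy comparison the item with a misconception-linked distractor classifies better overall, and it is the only one of the two that can ever identify the "neither" class. That is the kind of effect the abstract attributes to its design principles, though the real analysis involves multi-item quizzes and the authors' own models.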

Type
Theory and Methods
Copyright
Copyright © 2022 The Author(s) under exclusive licence to The Psychometric Society

Footnotes

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s11336-022-09885-3.

Lou DiBello is deceased, but contributed substantially to this paper and seminally to the work that made it possible.
