Fitting Rasch Model using Appropriateness Measure Statistics

José Antonio López Pina; M. Dolores Hidalgo Montesinos

doi:10.1017/S113874160000500X

Published online by Cambridge University Press: 10 April 2014

José Antonio López Pina*: University of Murcia
M. Dolores Hidalgo Montesinos: University of Murcia
*Correspondence should be addressed to: José A. López Pina, Depto. de Psicología Básica y Metodología, Facultad de Psicología, Campus de Espinardo, 30100 Murcia (Spain). Phone: 968-363478. Fax: 968-364115. E-mail: jlpina@um.es

Abstract

This paper explores the distributional properties and power rates of the Lz, Eci2z, and Eci4z statistics when they are used as item fit statistics. The results were compared with the t-transformations of the Outfit and Infit mean squares (T-outfit and T-infit). Four sample sizes were selected: 100, 250, 500, and 1000 examinees. Abilities followed uniform and normal distributions with mean 0 and standard deviation 1, and uniform and normal distributions with mean –1 and standard deviation 1. The pseudo-guessing parameter was fixed at .25. Two ranges of difficulty parameters were selected: ±1 logits and ±2 logits. Two test lengths were selected: 15 and 30 items. The results showed important differences between the T-infit, T-outfit, Lz, Eci2z, and Eci4z statistics. The T-outfit, T-infit, and Lz statistics showed poor standardization with estimated parameters because their distributional properties were not close to the expected values. However, the Eci2z and Eci4z statistics showed satisfactory standardization under all conditions. Further, the power rates of Eci2z and Eci4z for detecting items that do not fit the Rasch model were 5% to 10% higher than those of Lz, T-outfit, and T-infit.

The aim of this study was to examine the power and distributional properties of three appropriateness measure statistics when they are used as item fit statistics. The statistics compared were Lz, Eci2z, and Eci4z. The results were compared with the T-outfit and T-infit statistics. Four sample sizes were selected: 100, 250, 500, and 1000 examinees. Several ability distributions were studied: uniform and normal with mean 0 and standard deviation 1, and uniform and normal with mean –1 and standard deviation 1. The pseudo-guessing parameter was fixed at .25. For the difficulty parameters, two uniform distributions of ±1 logits and ±2 logits were used. Finally, two test lengths were considered: 15 and 30 items. The results showed that the Lz, T-outfit, and T-infit statistics do not tend toward their expected values when computed with estimated parameters, whereas the Eci2z and Eci4z statistics better preserved the properties of their theoretical distributions. In addition, the power of these two statistics to detect items that do not fit the Rasch model was between 5% and 10% higher than the power of the Lz, T-outfit, and T-infit statistics.
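As a rough illustration of the kind of design the abstract describes, the following Python sketch simulates one condition (1000 examinees, normal abilities with mean 0 and standard deviation 1, 15 items with difficulties in ±1 logits) and computes Outfit and Infit mean squares and an item-level Lz for each item. It is a minimal sketch under assumptions, not the authors' procedure: the formulas are the commonly used textbook definitions, true rather than estimated parameters are used, the evenly spaced difficulties, the random seed, and the choice of the last item as the misfitting one are illustrative, and the t-transformations and the Eci2z and Eci4z statistics are not implemented. The function names `rasch_prob` and `item_fit` are hypothetical.

```python
import numpy as np


def rasch_prob(theta, b):
    """Rasch probability of a correct response for every person-item pair."""
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))


def item_fit(x, p):
    """Return (outfit MS, infit MS, lz) per item.

    x: persons-by-items matrix of 0/1 responses.
    p: persons-by-items matrix of Rasch-model probabilities.
    """
    q = 1.0 - p
    sq_resid = (x - p) ** 2
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = np.mean(sq_resid / (p * q), axis=0)
    # Infit: squared residuals weighted by the response variance p * q.
    infit = sq_resid.sum(axis=0) / (p * q).sum(axis=0)
    # lz: log-likelihood of the item's responses, standardized across persons.
    loglik = (x * np.log(p) + (1.0 - x) * np.log(q)).sum(axis=0)
    expected = (p * np.log(p) + q * np.log(q)).sum(axis=0)
    variance = (p * q * np.log(p / q) ** 2).sum(axis=0)
    lz = (loglik - expected) / np.sqrt(variance)
    return outfit, infit, lz


rng = np.random.default_rng(2005)          # arbitrary seed (assumption)
n_persons, n_items, c = 1000, 15, 0.25     # one condition from the abstract

theta = rng.normal(0.0, 1.0, n_persons)    # abilities: normal, mean 0, SD 1
b = np.linspace(-1.0, 1.0, n_items)        # difficulties within +/-1 logits

p = rasch_prob(theta, b)                   # probabilities under the Rasch model
p_gen = p.copy()
# Assumption: the last item is generated with pseudo-guessing c = .25,
# so it should not fit the Rasch model; all other items are Rasch items.
p_gen[:, -1] = c + (1.0 - c) * p[:, -1]

x = (rng.random((n_persons, n_items)) < p_gen).astype(float)
outfit, infit, lz = item_fit(x, p)

for i in (0, n_items - 1):
    print(f"item {i + 1}: outfit={outfit[i]:.2f} infit={infit[i]:.2f} lz={lz[i]:.2f}")
```

In this setup a fitting item is expected to have mean squares near 1 and an Lz near 0, while the item generated with guessing should show an inflated Outfit and a clearly negative Lz. A power study along the lines of the abstract would repeat this over many replications and conditions and use estimated parameters.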

Keywords

Rasch model; item response theory; appropriateness measure; item fit statistics

Type: Articles
Information: The Spanish Journal of Psychology, Volume 8, Issue 1, May 2005, pp. 100–110

DOI: https://doi.org/10.1017/S113874160000500X
Copyright: Copyright © Cambridge University Press 2005
