Assessing Features of Psychometric Assessment Instruments: A Comparison of the COSMIN Checklist with Other Critical Appraisal Tools

  • Ulrike Rosenkoetter and Robyn L. Tate

Abstract

The past 20 years have seen the development of instruments designed to specify standards and evaluate the adequacy of published studies with respect to the quality of study design, the quality of findings, as well as the quality of their reporting. In the field of psychometrics, the first minimum set of standards for the review of psychometric instruments was published in 1996 by the Scientific Advisory Committee of the Medical Outcomes Trust. Since then, a number of tools have been developed with similar aims. The present paper reviews basic psychometric properties (reliability, validity and responsiveness), compares six tools developed for the critical appraisal of psychometric studies and provides a worked example of using the COSMIN checklist, Terwee-m statistical quality criteria, and the levels of evidence synthesis using the method of Schellingerhout and colleagues (2012). This paper will aid users and reviewers of questionnaires in the quality appraisal and selection of appropriate instruments by presenting available assessment tools, their characteristics and utility.

Corresponding author

Address for correspondence: Professor Robyn L. Tate, John Walsh Centre for Rehabilitation Research, University of Sydney, Level 9, Kolling Institute of Medical Research, Royal North Shore Hospital, St Leonards, New South Wales 2065, Australia. E-mail: robyn.tate@sydney.edu.au

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. American Educational Research Association.
Anastasi, A., & Urbina, S. (1997). Psychological testing. New Jersey: Prentice Hall.
Andresen, E.M. (2000). Criteria for assessing the tools of disability outcomes research. Archives of Physical Medicine and Rehabilitation, 81 (Suppl. 2), S15–S20.
Bayley, M.T., Tate, R., Douglas, J.M., Turkstra, L.S., Ponsford, J., Stergiou-Kita, M., . . . Bragge, P. (2014). INCOG guidelines for cognitive rehabilitation following traumatic brain injury: Methods and overview. The Journal of Head Trauma Rehabilitation, 29 (4), 290–306.
Bondy, M. (1974). Psychiatric antecedents of psychological testing (before Binet). Journal of the History of the Behavioral Sciences, 10 (2), 180–194.
Bossuyt, P.M., Reitsma, J.B., Bruns, D.E., Gatsonis, C.A., Glasziou, P.P., Irwig, L., . . . De Vet, H.C. (2015). STARD 2015: An updated list of essential items for reporting diagnostic accuracy studies. Radiology, 277 (3), 826–832.
Cardol, M., Beelen, A., van den Bos, G.A., de Jong, B.A., de Groot, I.J., & de Haan, R.J. (2002). Responsiveness of the Impact on Participation and Autonomy questionnaire. Archives of Physical Medicine and Rehabilitation, 83 (11), 1524–1529.
Cardol, M., de Haan, R.J., de Jong, B.A., van den Bos, G.A., & de Groot, I.J. (2001). Psychometric properties of the Impact on Participation and Autonomy questionnaire. Archives of Physical Medicine and Rehabilitation, 82 (2), 210–216.
Cardol, M., de Haan, R.J., van den Bos, G.A., de Jong, B.A., & de Groot, I.J. (1999). The development of a handicap assessment questionnaire: The Impact on Participation and Autonomy (IPA). Clinical Rehabilitation, 13 (5), 411–419.
Charters, E., Gillett, L., & Simpson, G.K. (2015). Efficacy of electronic portable assistive devices for people with acquired brain injury: A systematic review. Neuropsychological Rehabilitation, 25 (1), 82–121.
Costa, D.S. (2015). Reflective, causal, and composite indicators of quality of life: A conceptual or an empirical distinction? Quality of Life Research, 24 (9), 2057–2065.
de Vet, H., Terwee, C., & Bouter, L. (2003). Clinimetrics and psychometrics: Two sides of the same coin. Journal of Clinical Epidemiology, 56 (12), 1146–1147.
de Vet, H.C., Terwee, C.B., Mokkink, L.B., & Knol, D.L. (2011a). Measurement in medicine: A practical guide. Cambridge: Cambridge University Press.
de Vet, H.C., Terwee, C.B., Mokkink, L.B., & Knol, D.L. (2011b). Systematic reviews of measurement properties. In Measurement in medicine: A practical guide (pp. 275–314). Cambridge: Cambridge University Press.
de Vet, H.C., Terwee, C.B., Ostelo, R.W., Beckerman, H., Knol, D.L., & Bouter, L.M. (2006). Minimal changes in health status questionnaires: Distinction between minimally detectable change and minimally important change. Health and Quality of Life Outcomes, 4 (1), 54.
DeVellis, R.F. (2003). Scale development: Theory and applications. Thousand Oaks, CA: Sage Publications.
Downs, S.H., & Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. Journal of Epidemiology & Community Health, 52 (6), 377–384.
Dunn, T.J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105 (3), 399–412.
Francis, D.O., McPheeters, M.L., Noud, M., Penson, D.F., & Feurer, I.D. (2016). Checklist to operationalize measurement characteristics of patient-reported outcome measures. Systematic Reviews, 5 (1), 129.
Frost, M.H., Reeve, B.B., Liepa, A.M., Stauffer, J.W., & Hays, R.D. (2007). What is sufficient evidence for the reliability and validity of patient-reported outcome measures? Value in Health, 10, S94–S105.
Gibby, R.E., & Zickar, M.J. (2008). A history of the early days of personality testing in American industry: An obsession with adjustment. History of Psychology, 11 (3), 164–184.
Guyatt, G.H., Kirshner, B., & Jaeschke, R. (1992). Measuring health status: What are the necessary measurement properties? Journal of Clinical Epidemiology, 45 (12), 1341–1345.
Hayton, J.C., Allen, D.G., & Scarpello, V. (2004). Factor retention decisions in exploratory factor analysis: A tutorial on parallel analysis. Organizational Research Methods, 7 (2), 191–205.
Howell, R.D., Breivik, E., & Wilcox, J.B. (2007). Reconsidering formative measurement. Psychological Methods, 12 (2), 205–218.
Jarvis, C.B., MacKenzie, S.B., & Podsakoff, P.M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30 (2), 199–218.
Kirshner, B., & Guyatt, G. (1985). A methodological framework for assessing health indices. Journal of Chronic Diseases, 38 (1), 27–36.
Kottner, J., Audigé, L., Brorson, S., Donner, A., Gajewski, B.J., Hróbjartsson, A., . . . Streiner, D.L. (2011). Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. International Journal of Nursing Studies, 48 (6), 661–671.
Kratochwill, T.R., Hitchcock, J., Horner, R., Levin, J.R., Odom, S., Rindskopf, D., & Shadish, W. (2010). Single-case designs technical documentation. Retrieved from What Works Clearinghouse website: http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf
Lohr, K.N., Aaronson, N.K., Alonso, J., Burnam, M.A., Patrick, D.L., Perrin, E.B., & Roberts, J.S. (1996). Evaluating quality-of-life and health status instruments: Development of scientific review criteria. Clinical Therapeutics, 18 (5), 979–992.
MacCallum, R.C., Widaman, K.F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4 (1), 84–99.
MacKenzie, S.B., Podsakoff, P.M., & Podsakoff, N.P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35 (2), 293–334.
Maher, C.G., Sherrington, C., Herbert, R.D., Moseley, A.M., & Elkins, M. (2003). Reliability of the PEDro scale for rating quality of randomized controlled trials. Physical Therapy, 83 (8), 713–721.
Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . De Vet, H.C. (2010a). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: An international Delphi study. Quality of Life Research, 19 (4), 539–549.
Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2010b). The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology, 63 (7), 737–745.
Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2012). COSMIN checklist manual. Amsterdam, The Netherlands: University Medical Center.
Mokkink, L.B., Terwee, C.B., Stratford, P.W., Alonso, J., Patrick, D.L., Riphagen, I., . . . De Vet, H.C. (2009). Evaluation of the methodological quality of systematic reviews of health status measurement instruments. Quality of Life Research, 18 (3), 313–333.
Noel-Storr, A.H., McCleery, J.M., Richard, E., Ritchie, C.W., Flicker, L., Cullum, S.J., . . . Rutjes, A.W. (2014). Reporting standards for studies of diagnostic test accuracy in dementia: The STARDdem Initiative. Neurology, 83 (4), 364–373.
Revelle, W., & Zinbarg, R.E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74 (1), 145–154.
Rust, J., & Golombok, S. (2009). Modern psychometrics. The science of psychological assessment (3rd ed.). London: Routledge.
Schellingerhout, J.M., Verhagen, A.P., Heymans, M.W., Koes, B.W., de Vet, H.C., & Terwee, C.B. (2012). Measurement properties of disease-specific questionnaires in patients with neck pain: A systematic review. Quality of Life Research, 21 (4), 659–670.
Schmidt, S., Garin, O., Pardo, Y., Valderas, J.M., Alonso, J., Rebollo, P., . . . EMPRO Group. (2014). Assessing quality of life in patients with prostate cancer: A systematic and standardized comparison of available instruments. Quality of Life Research, 23 (8), 2169–2181.
Schreiber, J.B., Nora, A., Stage, F.K., Barlow, E.A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. The Journal of Educational Research, 99 (6), 323–338.
Schulz, K.F., Altman, D.G., & Moher, D. (2010). CONSORT 2010 statement: Updated guidelines for reporting parallel group randomised trials. BMC Medicine, 8 (1), 18.
Scientific Advisory Committee of the Medical Outcomes Trust, Aaronson, N., Alonso, J., Burnam, A., Lohr, K.N., Patrick, D.L., . . . Stein, R.E. (2002). Assessing health status and quality-of-life instruments: Attributes and review criteria. Quality of Life Research, 11 (3), 193–205.
Streiner, D. (2003a). Clinimetrics vs. psychometrics: An unnecessary distinction. Journal of Clinical Epidemiology, 56 (12), 1142–1145. doi: 10.1016/j.jclinepi.2003.08.011.
Streiner, D.L. (2003b). Test development: Two-sided coin or one-sided Möbius strip? Journal of Clinical Epidemiology, 56 (12), 1148–1149.
Streiner, D.L., & Kottner, J. (2014). Recommendations for reporting the results of studies of instrument and scale development and testing. Journal of Advanced Nursing, 70 (9), 1970–1979.
Streiner, D.L., Norman, G.R., & Cairney, J. (2015a). Health measurement scales (5th ed.). Oxford, UK: Oxford University Press.
Streiner, D.L., Norman, G.R., & Cairney, J. (2015b). Reporting test results. In Streiner, D.L., Norman, G.R., & Cairney, J. (Eds.), Health measurement scales (5th ed., pp. 349–356). Oxford, UK: Oxford University Press.
Tang, W., Cui, Y., & Babenko, O. (2014). Internal consistency: Do we really know what it is and how to assess it. Journal of Psychology and Behavioral Science, 2 (2), 205–220.
Tate, R.L., Perdices, M., Rosenkoetter, U., Shadish, W., Vohra, S., Barlow, D.H., . . . Wilson, B. (2016). The single-case reporting guideline in behavioural interventions (SCRIBE) 2016 statement. Archives of Scientific Psychology, 4 (1), 1–9.
Tate, R.L., Rosenkoetter, U., Wakim, D., Sigmundsdottir, L., Doubleday, J., Togher, L., . . . Perdices, M. (2015). The Risk-of-bias in N-of-1 Trials (RoBiNT) scale: An expanded manual for the critical appraisal of single-case reports. Sydney, Australia: The Author(s).
Terwee, C.B., Bot, S.D., de Boer, M.R., van der Windt, D.A., Knol, D.L., Dekker, J., . . . de Vet, H.C. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology, 60 (1), 34–42.
Terwee, C.B., Mokkink, L.B., Knol, D.L., Ostelo, R.W., Bouter, L.M., & de Vet, H.C. (2012). Rating the methodological quality in systematic reviews of studies on measurement properties: A scoring system for the COSMIN checklist. Quality of Life Research, 21 (4), 651–657.
The AGREE Collaboration. (2003). Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: The AGREE project. Quality and Safety in Health Care, 12, 18–23.
Trizano-Hermosilla, I., & Alvarado, J.M. (2016). Best alternatives to Cronbach's alpha reliability in realistic conditions: Congeneric and asymmetrical measurements. Frontiers in Psychology, 7, 769.
Turner‐Stokes, L., Pick, A., Nair, A., Disler, P.B., & Wade, D.T. (2015). Multi‐disciplinary rehabilitation for acquired brain injury in adults of working age. The Cochrane Library, Issue 12. Art. No.: CD004170.
Valderas, J.M., Ferrer, M., Mendívil, J., Garin, O., Rajmil, L., Herdman, M., & Alonso, J. (2008). Development of EMPRO: A tool for the standardized assessment of patient-reported outcome measures. Value in Health, 11 (4), 700–708.
