
Interrater Reliability: Item Analysis to Develop Valid Questions for Case-Study Scenarios

Published online by Cambridge University Press: 02 November 2020

Kelly Holmes, Mishga Moinuddin, and Sandi Steinfeld
Affiliation: Infection Prevention & Management Associates, Inc.

Abstract


Background: Development of an interrater reliability (IRR) process for healthcare-associated infection surveillance is a valuable learning tool for infection preventionists (IPs) and increases accuracy and consistency in applying National Healthcare Safety Network (NHSN) definitions (1-3). Case studies drawn from numerous resources were distributed to IPs of varying experience levels (4-6). Item analysis, including the item difficulty index and the item discrimination index, was applied to individual test questions to determine how validly the case scenarios measured individual mastery of the NHSN surveillance definitions (7-8).

Methods: Beginning in 2016, a mandatory internal IRR program was developed and distributed to IPs of varying experience levels. Each year through 2019, a test of 30–34 multiple-choice case-study questions was developed. Case studies were analyzed using 2 statistical methods to determine item difficulty and the validity of the written scenarios. A difficulty index (the p value, ie, the proportion of respondents answering correctly) was calculated for each test question, with harder questions yielding values closer to 0.0. A point-biserial correlation, ranging from −1.0 to 1.0, was calculated for each question to identify highly discriminating questions (the standard formulas for both indices are sketched below).

Results: Between 2016 and 2019, 124 questions were developed and 145 respondents participated in the mandatory IRR program. The overall test difficulty was 0.70 (range, 0.64–0.74). Of the 124 questions, 17 (14%) showed "excellent" discrimination, 41 (33%) showed "good" discrimination, 57 (46%) showed "poor" discrimination, and 9 (7%) had negative discrimination values.

Conclusions: IRR testing identifies educational opportunities for IPs responsible for the correct application of NHSN surveillance definitions. Valid test scenarios are foundational components of IRR tests. Case scenarios determined to have a high discrimination index should be used to develop future test questions to better assess mastery in applying surveillance definitions to clinical cases.
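For reference, the abstract itself does not print the two formulas it applies; the following is a sketch assuming the conventional item-analysis definitions of the difficulty index and the point-biserial discrimination index:

$$p_i = \frac{R_i}{T}$$

$$r_{pb} = \frac{\bar{X}_1 - \bar{X}_0}{s_X}\,\sqrt{p_i\,(1 - p_i)}$$

where $R_i$ is the number of respondents answering item $i$ correctly, $T$ is the total number of respondents, $\bar{X}_1$ and $\bar{X}_0$ are the mean total test scores of respondents who answered the item correctly and incorrectly, respectively, and $s_X$ is the standard deviation of total test scores. Under these conventional definitions, $p_i$ near 0.0 marks a hard item and $r_{pb}$ near 1.0 marks an item that strongly separates high scorers from low scorers, consistent with the ranges reported in the Methods and Results above.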

Funding: None

Disclosures: None

Type
Poster Presentations
Copyright
© 2020 by The Society for Healthcare Epidemiology of America. All rights reserved.