
Interrater Reliability: Item Analysis to Develop Valid Questions for Case-Study Scenarios

Published online by Cambridge University Press: 02 November 2020

Kelly Holmes, Mishga Moinuddin, and Sandi Steinfeld
Affiliation: Infection Prevention & Management Associates, Inc.

Abstract


Background: Development of an interrater reliability (IRR) process for healthcare-associated infection surveillance is a valuable learning tool for infection preventionists (IPs) and increases accuracy and consistency in applying National Healthcare Safety Network (NHSN) definitions (1-3). Case studies drawn from numerous resources were distributed to IPs of varying experience levels (4-6). Item analysis, including the item difficulty index and the item discrimination index, was applied to individual test questions to determine how validly the case scenarios measured individual mastery of the NHSN surveillance definitions (7-8).

Methods: Beginning in 2016, a mandatory internal IRR program was developed and distributed to IPs of varying experience levels. Each year through 2019, a test of 30–34 multiple-choice case-study questions was developed. Case studies were analyzed using 2 statistical methods to determine item difficulty and the validity of the written scenarios. A difficulty index (the p value, ie, the proportion of respondents answering correctly) was calculated for each test question, with harder questions yielding values closer to 0.0. A point-biserial correlation, ranging from −1.0 to 1.0, was calculated for each question to identify highly discriminating questions (the standard formulas for both indices are sketched below).

Results: Between 2016 and 2019, 124 questions were developed and 145 respondents participated in the mandatory IRR program. The overall test difficulty was 0.70 (range, 0.64–0.74). Of the 124 questions, 17 (14%) showed "excellent" discrimination, 41 (33%) showed "good" discrimination, 57 (46%) showed "poor" discrimination, and 9 (7%) had negative discrimination values.

Conclusions: IRR testing identifies educational opportunities for IPs responsible for the correct application of NHSN surveillance definitions. Valid test scenarios are foundational components of IRR tests. Case scenarios determined to have a high discrimination index should be used to develop future test questions to better assess mastery in applying surveillance definitions to clinical cases.
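For reference, the abstract itself does not print the two formulas it applies; the following is a sketch assuming the conventional item-analysis definitions of the difficulty index and the point-biserial discrimination index:

$$p_i = \frac{R_i}{T}$$

$$r_{pb} = \frac{\bar{X}_1 - \bar{X}_0}{s_X}\,\sqrt{p_i\,(1 - p_i)}$$

where $R_i$ is the number of respondents answering item $i$ correctly, $T$ is the total number of respondents, $\bar{X}_1$ and $\bar{X}_0$ are the mean total test scores of respondents who answered the item correctly and incorrectly, respectively, and $s_X$ is the standard deviation of total test scores. Under these conventional definitions, $p_i$ near 0.0 marks a hard item and $r_{pb}$ near 1.0 marks an item that strongly separates high scorers from low scorers, consistent with the ranges reported in the Methods and Results above.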

Funding: None

Disclosures: None

Type
Poster Presentations
Copyright
© 2020 by The Society for Healthcare Epidemiology of America. All rights reserved.