  • Print publication year: 2014
  • Online publication date: October 2014

7 - Multiple test results


Even though the diagnostic radiologist examines black-and-white images, the information that is derived from the images is hardly ever black-and-white.

M.G. Myriam Hunink


In the previous chapters we focused on dichotomous test results, e.g., fecal occult blood is either present or absent. Test results can conveniently be dichotomized, and thinking in terms of dichotomous test results is generally helpful. Distinguishing patients with and without the target disease is useful for subsequent decision making because most medical actions are dichotomous. In reality, however, most test results have more than two possible outcomes.

Test results can be categorical, ordinal, or continuous. For example, categories of a diagnostic imaging test may be defined by key findings on the images. These categories may be ordered (intuitively) according to the observer’s confidence in the diagnosis, based on the findings. As an example, abnormalities seen on mammography are commonly reported as definitely malignant, probably malignant, possibly malignant, probably benign, or definitely benign. As we shall see later in this chapter, it makes sense to order the categories (explicitly) according to increasing likelihood ratio (LR). Some test results are inherently ordinal, e.g., the five categories of a Papanicolaou smear (a test for cervical cancer). Results of biochemical tests are usually given on a continuous scale, which may be reduced to an ordinal scale by grouping the test results. Conversely, a test result on a continuous scale can be considered a result on an ordinal scale with an infinite number of very narrow categories. Scores from prediction models are on an ordinal scale if there is a finite number of possible scores, and on a continuous scale if there is an infinite number of possible scores.

When test results are categorical, ordinal, or continuous, we have to consider many test results Ri, where the index i runs over the categories and the number of categories can range from two (the case considered in Chapter 5 and Chapter 6, T+ and T−) up to any number.
Interpretation of a test result on an ordinal scale can be considered a generalization of the situation of dichotomous test results.
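
As a minimal sketch of this generalization, the category-specific likelihood ratio for result Ri can be computed as the probability of Ri among patients with the disease divided by the probability of Ri among patients without it. The categories can then be ordered by increasing LR, and a prior probability can be revised through the odds form of Bayes' rule. The counts below are hypothetical and invented purely for illustration; they are not taken from any study, and the function names are assumptions of this sketch.

```python
from fractions import Fraction

# Hypothetical counts per category: (with disease, without disease).
# These numbers are invented for illustration only, not real data.
counts = {
    "definitely benign":    (2, 60),
    "probably benign":      (8, 25),
    "possibly malignant":   (10, 10),
    "probably malignant":   (30, 4),
    "definitely malignant": (50, 1),
}


def likelihood_ratios(counts):
    """Return LR_i = P(R_i | D+) / P(R_i | D-) for every category."""
    n_diseased = sum(d for d, _ in counts.values())
    n_nondiseased = sum(nd for _, nd in counts.values())
    return {
        result: Fraction(d, n_diseased) / Fraction(nd, n_nondiseased)
        for result, (d, nd) in counts.items()
    }


def posterior_probability(prior, lr):
    """Revise a prior probability: posterior odds = prior odds * LR."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)


lrs = likelihood_ratios(counts)

# Order the categories explicitly by increasing likelihood ratio.
ordered = sorted(lrs, key=lrs.get)

for result in ordered:
    post = posterior_probability(0.10, lrs[result])
    print(f"{result:22s} LR = {float(lrs[result]):6.2f}  "
          f"posterior = {float(post):.3f}")
```

With only two categories (T+ and T−), the same function returns the familiar positive and negative likelihood ratios, so the dichotomous case is recovered as a special case. A category with an LR of 1 leaves the prior probability unchanged.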
