
Teaching Receiver Operating Characteristic Analysis

Rationale and Objectives

Despite its fundamental importance in the evaluation of diagnostic tests, receiver operating characteristic (ROC) analysis is not easily understood. The purpose of this project was to create a learning experience that resulted in an intuitive understanding of the basic principles of ROC analysis.

Materials and Methods

An interactive laboratory exercise was developed for a class about radiology testing taught within a clinical epidemiology course between 2000 and 2009. The physician students in the course were clinical fellows from various medical specialties who were enrolled in a graduate degree program in clinical investigation. For the exercise, the class was divided into six groups. Each group interpreted radiographs from a set of 50 exams of the peripheral skeleton to determine the presence or absence of an acute fracture. Data from the class were pooled and given to each student. Students calculated the area under the ROC curve (AUC) corresponding to overall class performance. A binormal ROC curve was also fitted to the data from each class year.

Results

The laboratory exercise was conducted for eight years with approximately 20–30 students per year. The mean AUC over the eight laboratory classes was 0.72, with a standard deviation of 0.08 (range, 0.60–0.85).

Conclusion

With some simplifications in design, an observer study can be conducted in a laboratory classroom setting. Participatory data collection promotes the intuitive understanding of ROC analysis principles.

Receiver operating characteristic (ROC) analysis is an established statistical method in the evaluation of diagnostic tests. In the clinical laboratory, ROC curves are typically constructed from numerical test results collected from a number of patients. The ROC curve shows all the possible sensitivity/specificity pairs that result from all possible cutoff values between normal and abnormal.

Radiology exams represent a type of medical test that usually does not produce a numerical result. Instead, the typical clinical question is binary: to determine the presence or absence of a particular diagnosis. However, the interpretation of radiology exams often involves uncertainty. If we express this uncertainty on an ordinal scale, then we have data that are amenable to ROC analysis, as the sketch below illustrates.
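
As a concrete illustration (not from the article), here is a minimal Python sketch of how ordinal confidence ratings yield an empirical ROC curve. The rating data are hypothetical, with categories running from 1 (definitely negative) to 4 (definitely positive):

```python
# Minimal sketch: empirical ROC operating points from ordinal ratings.
# Categories: 1 = definitely negative ... 4 = definitely positive.

def roc_points(pos_ratings, neg_ratings, categories=(1, 2, 3, 4)):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff."""
    points = [(0.0, 0.0)]  # strictest cutoff: no case is called positive
    # Relax the cutoff one category at a time, from "category 4 only"
    # down to "category 1 or more" (every case called positive).
    for cutoff in sorted(categories, reverse=True):
        tp = sum(r >= cutoff for r in pos_ratings)  # positives called positive
        fp = sum(r >= cutoff for r in neg_ratings)  # negatives called positive
        points.append((fp / len(neg_ratings), tp / len(pos_ratings)))
    return points

# Hypothetical ratings for 5 fracture and 5 nonfracture cases:
print(roc_points(pos_ratings=[4, 4, 3, 2, 1], neg_ratings=[1, 1, 2, 2, 3]))
# [(0.0, 0.0), (0.0, 0.4), (0.2, 0.6), (0.6, 0.8), (1.0, 1.0)]
```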

Materials and methods

Radiograph Test Set

Physician Students

Interpretation Protocol

Figure 1, Raw data collection form for radiograph interpretation. To complete the form, the students entered the case number and circled the diagnosis category for each case.

Data Analysis

Results

Table 1

Example of Compiled Class Data Collected during Laboratory Exercise in 2007 ∗

| Interpreted Diagnostic Category | Number of True Positives | Number of True Negatives |
| --- | --- | --- |
| Category 1: definitely negative | 4 | 11 |
| Category 2: probably negative | 7 | 11 |
| Category 3: probably positive | 3 | 2 |
| Totals | 25 | 25 |
| Category 4: definitely positive | 11 | 1 |

Table 2

Calculation of Sensitivity and Specificity at Each Possible Cutoff Point for Example Data from 2007 ∗

| ROC Point | Cutoff for Calling a Case Positive | Sensitivity | Specificity | 1 − Specificity |
| --- | --- | --- | --- | --- |
| A | None | 0.00 | 1.00 | 0.00 |
| B | Category = 4 only | 11/25 = 0.44 | 24/25 = 0.96 | 0.04 |
| C | Category = 3 or more | 14/25 = 0.56 | 22/25 = 0.88 | 0.12 |
| D | Category = 2 or more | 21/25 = 0.84 | 11/25 = 0.44 | 0.56 |
| E | Category = 1 or more | 1.00 | 0.00 | 1.00 |
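
To make the arithmetic in Tables 1 and 2 concrete, here is a short Python sketch (not part of the original exercise, which was done by hand) that recomputes each operating point from the compiled 2007 counts in Table 1:

```python
# Recompute Table 2 from the Table 1 counts.
# counts[c] = (actually positive cases rated c, actually negative cases rated c)
counts = {1: (4, 11), 2: (7, 11), 3: (3, 2), 4: (11, 1)}

n_pos = sum(p for p, _ in counts.values())  # 25 fracture cases
n_neg = sum(n for _, n in counts.values())  # 25 nonfracture cases

for cutoff in (4, 3, 2, 1):  # "category >= cutoff" is called positive
    tp = sum(p for c, (p, _) in counts.items() if c >= cutoff)
    fp = sum(n for c, (_, n) in counts.items() if c >= cutoff)
    sens = tp / n_pos
    spec = 1 - fp / n_neg
    print(f"cutoff >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")

# cutoff >= 4: sensitivity 0.44, specificity 0.96
# cutoff >= 3: sensitivity 0.56, specificity 0.88
# cutoff >= 2: sensitivity 0.84, specificity 0.44
# cutoff >= 1: sensitivity 1.00, specificity 0.00
```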

Figure 2, Plot of the empirical receiver operating characteristic curve from the 2007 sample calculations shown in Table 2. Each operating point is labeled according to Table 2.

Table 3

Calculation of Area under Empirical Receiver Operating Characteristic Curve ∗

| Region | Width (Difference) | Height of Left Point | Height of Right Point | Area |
| --- | --- | --- | --- | --- |
| Between points A and B | 0.04 | 0.00 | 0.44 | 0.0088 |
| Between points B and C | 0.08 | 0.44 | 0.56 | 0.0400 |
| Between points C and D | 0.44 | 0.56 | 0.84 | 0.3080 |
| Between points D and E | 0.44 | 0.84 | 1.00 | 0.4048 |
| Total | | | | 0.7616 |
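
The total in Table 3 is simply the trapezoidal rule applied to operating points A through E. A minimal Python sketch of the same computation (the variable names are mine, not the article's):

```python
# Trapezoidal-rule AUC from the Table 2 operating points A through E.
x = [0.00, 0.04, 0.12, 0.56, 1.00]  # 1 - specificity at points A..E
y = [0.00, 0.44, 0.56, 0.84, 1.00]  # sensitivity at points A..E

# Each trapezoid: width times the average of its two heights.
auc = sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))
print(f"AUC = {auc:.4f}")  # 0.7616, matching the Table 3 total
```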

Figure 3, Plot of fitted binormal receiver operating characteristic curves for each class participating in the annual laboratory exercise.
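
The fitted curves in Figure 3 come from the binormal model, in which the true-positive fraction is Φ(a + b·Φ⁻¹(FPF)) for intercept a and slope b, where Φ is the standard normal cumulative distribution function. Below is a minimal Python sketch of this model, assuming illustrative parameter values rather than fitted values from any class year:

```python
# Sketch (not the authors' code) of the binormal ROC model:
#   TPF = Phi(a + b * Phi^{-1}(FPF)), Phi = standard normal CDF.
# The parameters a and b here are hypothetical, not fits to the class data.
import numpy as np
from scipy.stats import norm

a, b = 1.0, 1.0                        # hypothetical binormal parameters
fpf = np.linspace(0.001, 0.999, 99)    # grid of false-positive fractions
tpf = norm.cdf(a + b * norm.ppf(fpf))  # corresponding true-positive fractions

# Closed-form area under the binormal curve:
auc = norm.cdf(a / np.sqrt(1 + b ** 2))
print(f"AUC = {auc:.3f}")  # about 0.760 for a = b = 1
print(f"TPF at FPF = 0.10: {np.interp(0.10, fpf, tpf):.3f}")
```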

Table 4

Class Performance in Laboratory Exercise between 2000 and 2009 ∗

| Year | Area under Receiver Operating Characteristic Curve (SE) |
| --- | --- |
| 2000 | 0.851 (0.059) |
| 2001 | 0.759 (0.071) |
| 2002 | 0.759 (0.076) |
| 2003 | 0.602 (0.088) |
| 2004 | 0.640 (0.083) |
| 2006 | 0.679 (0.080) |
| 2007 | 0.777 (0.072) |
| 2008 | 0.695 (0.083) |

SE, standard error of the fitted estimate.

Discussion

Acknowledgment

References

  • 1. Lusted L.B.: Signal detectability and medical decision-making. Science 1971; 171: pp. 1217-1219.

  • 2. Metz C.E.: Basic principles of ROC analysis. Semin Nucl Med 1978; 8: pp. 283-298.

  • 3. Metz C.E.: ROC methodology in radiologic imaging. Invest Radiol 1986; 21: pp. 720-733.

  • 4. Hanley J.A., McNeil B.J.: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982; 143: pp. 29-36.

  • 5. Obuchowski N.A.: ROC analysis. Am J Roentgenol 2005; 184: pp. 364-372.

  • 6. Eng J.: Receiver operating characteristic analysis: a primer. Acad Radiol 2005; 12: pp. 909-916.

  • 7. Metz C.E.: ROC analysis in medical imaging: a tutorial review of the literature. Radiol Phys Technol 2008; 1: pp. 2-12.

  • 8. Scott W.W., Bluemke D.A., Mysko W.K., et. al.: Interpretation of emergency department radiographs by radiologists and emergency medicine physicians: teleradiology workstation versus radiograph readings. Radiology 1995; 195: pp. 223-229.

  • 9. Eng J., Mysko W.K., Weller G.E., et. al.: Interpretation of emergency department radiographs: a comparison of emergency medicine physicians with radiologists, residents with faculty, and film with digital display. Am J Roentgenol 2000; 175: pp. 1233-1238.

  • 10. Eng J.: ROC analysis: web-based calculator for ROC curves. www.jrocfit.org. Accessed August 31, 2012.

  • 11. Dorfman D.D., Alf E.: Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals: rating method data. J Math Psychol 1969; 6: pp. 487-496.

  • 12. Metz C.E.: Some practical issues of experimental design and data analysis in radiological ROC studies. Invest Radiol 1989; 24: pp. 234-245.

  • 13. Berbaum K.S., El-Khoury G.Y., Franken E.A., et. al.: Impact of clinical history on fracture detection with radiography. Radiology 1988; 168: pp. 507-511.

  • 14. Berbaum K.S., Franken E.A., El-Khoury G.Y.: Impact of clinical history on radiographic detection of fractures: a comparison of radiologists and orthopedists. Am J Roentgenol 1989; 153: pp. 1221-1224.

  • 15. Loy C.T., Irwig L.: Accuracy of diagnostic tests read with and without clinical information: a systematic review. JAMA 2004; 292: pp. 1602-1609.

  • 16. Gur D., Bandos A.I., Rockette H.E., et. al.: Is an ROC-type response truly always better than a binary response in observer performance studies? Acad Radiol 2010; 17: pp. 639-645.

  • 17. Egglin T.K., Feinstein A.R.: Context bias: a problem in diagnostic radiology. JAMA 1996; 276: pp. 1752-1755.

  • 18. Yoon L.S., Haims A.H., Brink J.A., et. al.: Evaluation of an emergency radiology quality assurance program at a level I trauma center: abdominal and pelvic CT studies. Radiology 2002; 224: pp. 42-46.

  • 19. Hofvind S., Geller B.M., Rosenberg R.D., et. al.: Screening-detected breast cancers: discordant independent double reading in a population-based screening program. Radiology 2009; 253: pp. 652-660.

  • 20. Dorfman D.D., Berbaum K.S., Metz C.E.: Receiver operating characteristic analysis: generalization to the population of readers and patients with the jackknife method. Invest Radiol 1992; 27: pp. 723-731.

  • 21. Hillis S.L., Berbaum K.S., Metz C.E.: Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Acad Radiol 2008; 15: pp. 647-661.
