Receiver Operating Characteristic Analysis

This issue of Academic Radiology is the second of two issues honoring the memory of Dr. Charles E. Metz, who pioneered the application of receiver operating characteristic (ROC) analysis for evaluating the diagnostic performance of radiology exams. Along with 10 articles that appeared in the December 2012 issue of Academic Radiology, the 15 articles featured in this issue represent some of the latest research in ROC analysis. Together, the 25 articles are a fitting tribute to the groundwork laid by Dr. Metz. If you examine the references cited in these articles, you will discover that a number of Dr. Metz's seminal articles in ROC analysis have appeared in the pages of Academic Radiology.

ROC analysis can be complex, and it is filled with methodological nuances. To aid the reader, I have been asked by the Editor to place each of the articles in a context that is accessible to the practicing academic radiologist, as I did for the first Metz memorial issue.

The classic ROC curve represents an intrinsic property of a diagnostic test and plots the continuous tradeoff between sensitivity and specificity of the test. In practice, a radiologist chooses, consciously or subconsciously, one sensitivity/specificity point on the ROC curve at which to operate. But which operating point is optimum? The ROC curve by itself cannot answer this question. Defining the optimum operating point requires additional information, namely the prevalence of the disease being diagnosed and the relative clinical value of the possible test outcomes (true positive, false positive, true negative, and false negative). This additional information also defines the overall clinical value of the diagnostic test when the optimum operating point on the ROC curve is used. Clinically speaking, the ROC curve by itself says little about the value of the diagnostic test.
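
To make the dependence explicit, the classical decision-theoretic condition for the optimum operating point, stated here without derivation, is that it lies where the slope of the ROC curve satisfies

    \frac{dTPF}{dFPF}\bigg|_{\mathrm{opt}} = \frac{1-p}{p} \cdot \frac{U_{TN}-U_{FP}}{U_{TP}-U_{FN}}

where p is the disease prevalence, TPF and FPF are the true- and false-positive fractions, and the U terms are the utilities of the four possible outcomes. Note that every quantity on the right-hand side comes from outside the ROC curve itself.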

Determining the relative clinical value of the possible diagnostic test outcomes involves utility theory and is beyond the scope of this editorial. Such work can be extremely difficult and requires answering problematic questions, such as how many false positives ("cost") are worth one true positive ("benefit"). For the purpose of this editorial, we will assume that these outcome values, known as utilities, have already been determined.
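
To make this concrete, the following minimal sketch (in Python, with an invented empirical ROC curve and illustrative prevalence and utility values, none of which come from the articles discussed here) shows how a specific optimum operating point emerges once prevalence and utilities are supplied:

    import numpy as np

    # Hypothetical empirical ROC curve: paired false-positive and
    # true-positive fractions, ordered along the curve.
    fpf = np.array([0.00, 0.05, 0.15, 0.40, 0.70, 1.00])
    tpf = np.array([0.00, 0.45, 0.72, 0.99, 1.00, 1.00])

    # Illustrative prevalence and outcome utilities (assumed, not measured).
    p = 0.10                 # disease prevalence
    u_tp, u_fn = 1.0, 0.0    # utility of true positive / false negative
    u_tn, u_fp = 1.0, 0.6    # utility of true negative / false positive

    # Expected utility of operating at each ROC point: diseased cases
    # contribute through TPF, non-diseased cases through FPF.
    eu = p * (tpf * u_tp + (1 - tpf) * u_fn) + \
         (1 - p) * ((1 - fpf) * u_tn + fpf * u_fp)

    best = np.argmax(eu)
    print(f"Optimum: FPF={fpf[best]:.2f}, TPF={tpf[best]:.2f}, EU={eu[best]:.3f}")

With these numbers, the low prevalence pulls the optimum toward the high-specificity end of the curve; raising p or the penalty for false negatives moves it the other way.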

The first four articles examine issues that arise when information about disease prevalence and clinical utility is added to ROC analysis. In the first article, Abbey et al compare the statistical power of using utility as the main performance measure of a diagnostic test instead of the area under the ROC curve (AUC). Statistical power is an important study design consideration because it is the main determinant of how many participants must be recruited for a study to give statistically meaningful results. In the second article, Zou et al consider several common clinical utility metrics that have been used to define the optimum operating point of the ROC curve. The article demonstrates that, even for the same ROC data, the optimum operating point depends on which metric is chosen.
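
That observation is easy to reproduce in miniature. The sketch below applies two common textbook criteria (Youden's J statistic and the Euclidean distance to the ideal corner at FPF = 0, TPF = 1; the data are invented, and these are not necessarily the exact metrics from Zou et al's article) to the same ROC points and obtains two different "optimum" operating points:

    import numpy as np

    # Same invented empirical ROC curve as in the previous sketch.
    fpf = np.array([0.00, 0.05, 0.15, 0.40, 0.70, 1.00])
    tpf = np.array([0.00, 0.45, 0.72, 0.99, 1.00, 1.00])

    # Criterion 1: maximize Youden's J = sensitivity + specificity - 1.
    youden = tpf - fpf

    # Criterion 2: minimize distance to the perfect-test corner (0, 1).
    dist = np.hypot(fpf, 1 - tpf)

    i_j, i_d = np.argmax(youden), np.argmin(dist)
    print(f"Youden optimum:   FPF={fpf[i_j]:.2f}, TPF={tpf[i_j]:.2f}")
    print(f"Distance optimum: FPF={fpf[i_d]:.2f}, TPF={tpf[i_d]:.2f}")

On these data the two criteria disagree (Youden's J selects the point at FPF = 0.40, the distance criterion the point at FPF = 0.15), even though the underlying ROC curve is identical.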

In actual clinical settings, radiologists' interpretations are affected by their own perceptions of disease prevalence and clinical utility, especially the perceived "cost" of false negatives and false positives. For example, variations in such perceptions explain why some radiologists tend to be "overcallers" (higher sensitivity but lower specificity, on average) and some tend to be "undercallers" (lower sensitivity but higher specificity, on average). These perceptions and variations can cause difficulty when interpreting the results of ROC studies. In the third article, Samuelson demonstrates how such perceptions can cause the sensitivity, specificity, and AUC reported in a ROC study not to apply to radiologists in a real clinical environment. In the fourth article, Nishikawa and Pesce show that sensitivity and specificity estimates from ROC studies of radiologists can be inconsistent with the clinical recommendations elicited directly from the same radiologists.

References

  • 1. Nishikawa R.M.: Charles E. Metz, PhD. Acad Radiol 2012; 19: pp. 1537-1538.

  • 2. Eng J.: Teaching receiver operating characteristic analysis: an interactive laboratory exercise. Acad Radiol 2012; 19: pp. 1452-1456.

  • 3. Alemayehu D., Zou K.H.: Applications of ROC analysis in medical research: recent developments and future directions. Acad Radiol 2012; 19: pp. 1457-1464.

  • 4. Parast L., Cai B., Bedayat A., et al.: Statistical methods for predicting mortality in patients diagnosed with acute pulmonary embolism. Acad Radiol 2012; 19: pp. 1465-1473.

  • 5. Chakraborty D.P., Yoon H.J., Mello-Thoms C.: Application of threshold-bias independent analysis to eye-tracking and FROC data. Acad Radiol 2012; 19: pp. 1474-1483.

  • 6. McClish D.K.: Evaluation of the accuracy of medical tests in a region around the optimal point. Acad Radiol 2012; 19: pp. 1484-1490.

  • 7. Hillis S.L., Metz C.E.: An analytic expression for the binormal partial area under the ROC curve. Acad Radiol 2012; 19: pp. 1491-1498.

  • 8. Skaron A., Li K., Zhou X.H.: Statistical methods for MRMC ROC studies. Acad Radiol 2012; 19: pp. 1499-1507.

  • 9. Obuchowski N.A., Gallas B.D., Hillis S.L.: Multi-reader ROC studies with split-plot designs: a comparison of statistical methods. Acad Radiol 2012; 19: pp. 1508-1517.

  • 10. Hillis S.L.: Simulation of unequal-variance binormal multireader ROC decision data: an extension of the Roe and Metz simulation model. Acad Radiol 2012; 19: pp. 1518-1528.

  • 11. Li C., Glüer C.C., Eastell R., et al.: Tree-structured subgroup analysis of receiver operating characteristic curves for diagnostic tests. Acad Radiol 2012; 19: pp. 1529-1536.

  • 12. Eng J.: Sampling the latest work in receiver operating characteristic analysis: what does it mean? Acad Radiol 2012; 19: pp. 1449-1451.

  • 13. Hunink M.G.M., Glasziou P.P., Siegel J.E., et al.: Valuing outcomes. In: Decision making in health and medicine: integrating evidence and values. Cambridge: Cambridge University Press; 2001: pp. 88-127.

  • 14. Weinstein M.C., Fineberg H.V., Elstein A.S., et al.: Utility analysis: clinical decisions involving many possible outcomes. In: Clinical decision analysis. Philadelphia: W.B. Saunders; 1980: pp. 184-227.

  • 15. Abbey C.K., Samuelson F.W., Gallas B.D.: Statistical power considerations for a utility endpoint in observer performance studies. Acad Radiol 2013; 20: pp. 798-806.

  • 16. Zou K.H., Yu C.R., Liu K., et al.: Optimal thresholds by maximizing or minimizing various metrics via ROC-type analysis. Acad Radiol 2013; 20: pp. 807-815.

  • 17. Samuelson F.W.: Inference based on diagnostic measures from studies of new imaging devices. Acad Radiol 2013; 20: pp. 816-824.

  • 18. Nishikawa R.M., Pesce L.L.: Estimating sensitivity and specificity for technology assessment based on observer studies. Acad Radiol 2013; 20: pp. 825-830.

  • 19. Gonen M.: Mixtures of receiver operating characteristic curves. Acad Radiol 2013; 20: pp. 831-837.

  • 20. Perkins N.J., Schisterman E.F., Vexler A.: Multivariate normally distributed biomarkers subject to limits of detection and ROC curve inference. Acad Radiol 2013; 20: pp. 838-846.

  • 21. Hillis S.L., Berbaum K.S., Metz C.E.: Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Acad Radiol 2008; 15: pp. 647-661.

  • 22. Roe C.A., Metz C.E.: Dorfman-Berbaum-Metz method for statistical analysis of multireader, multimodality receiver operating characteristic data: validation with computer simulation. Acad Radiol 1997; 4: pp. 298-303.

  • 23. Dorfman D.D., Berbaum K.S., Metz C.E.: Receiver operating characteristic analysis: generalization to the population of readers and patients with the jackknife method. Invest Radiol 1992; 27: pp. 723-731.

  • 24. Drukker K., Horsch K.J., Pesce L.L., et al.: Inter-reader scoring variability in an observer study using dual-modality imaging for breast cancer detection in women with dense breasts. Acad Radiol 2013; 20: pp. 847-853.

  • 25. Chakraborty D.P.: A brief history of FROC paradigm data analysis. Acad Radiol 2013; 20: pp. 915-919.

  • 26. Tang L., Kang L., Liu C., et al.: An additive selection of markers to improve diagnostic accuracy based on a discriminatory measure. Acad Radiol 2013; 20: pp. 854-862.

  • 27. Pepe M.S., Fan J., Seymour C.W.: Estimating the ROC curve in studies that match controls to cases on covariates. Acad Radiol 2013; 20: pp. 863-873.

  • 28. Liu D., Zhou X.H.: ROC analysis in biomarker combination with covariate adjustment. Acad Radiol 2013; 20: pp. 874-882.

  • 29. Kolassa J., Seifu Y.: Nonparametric multivariate inference on shift parameters. Acad Radiol 2013; 20: pp. 883-888.

  • 30. Matthews G.J., Harel O.: An examination of data confidentiality and disclosure issues related to publication of empirical ROC curves. Acad Radiol 2013; 20: pp. 889-896.

  • 31. Jiang Y.: On the shape of the population ROC curve. Acad Radiol 2013; 20: pp. 897-907.

  • 32. Edwards D.C.: Validation of Monte Carlo estimates of three-class ideal observer operating points for normal data. Acad Radiol 2013; 20: pp. 908-914.
