
Imaging Technology and Practice Assessments

A recent study conducted by our research group revealed that there may be a significant “laboratory effect” in retrospective observer performance studies, one that may in turn limit the relevance of inferences generated by these studies to the clinical environment. Even though we found considerable consistency among the performance levels of the readers regardless of the specific method or rating scale used (ie, binary, receiver-operating characteristic [ROC], or free-response ROC [FROC]), our study showed that radiologists may perform significantly differently in the clinic than in the laboratory when reading the very same cases. The differences were reflected both in the readers’ overall performance levels (eg, sensitivity and specificity) and, perhaps more important, in the variability, or “spread,” among the observers’ performance levels. Although the results of our study should certainly be validated experimentally in other studies before general acceptance, there is a reasonably solid rationale for the observed outcome. Radiologists may perform differently in the laboratory in ways that are not a priori predictable; therefore, differences in their behavior cannot always be completely accounted for during retrospective studies. Even attempting to duplicate seemingly simple conditions, such as practice guidelines (eg, aim for a 10% recall rate), during an experiment ultimately may not reflect actual behavior in the clinic. Clinical decisions that affect patient management cannot be duplicated in laboratory studies. Because differences in behavior are difficult to account for in retrospective observer performance studies, we, the investigators, must ask ourselves: What next? How do we proceed with appropriate evaluations of new technologies and practices in a manner that is both practical and clinically relevant?
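
To make the notions of “overall performance” and “spread” concrete, the following is a minimal simulation sketch in Python, not the study’s actual data or analysis. It assumes a toy model in which each of nine hypothetical readers rates the same cases, with reader-to-reader skill varying more in a “clinic” condition than in a “laboratory” condition; the function names (`empirical_auc`, `reader_aucs`) and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def empirical_auc(truth, scores):
    """Empirical ROC AUC via the Mann-Whitney statistic (ties count one half)."""
    pos, neg = scores[truth == 1], scores[truth == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

n_cases, n_readers = 400, 9
truth = rng.integers(0, 2, size=n_cases)  # 1 = case is actually positive

def reader_aucs(skill_spread):
    """Simulate one AUC per reader; larger skill_spread means readers differ more."""
    skills = rng.normal(1.5, skill_spread, size=n_readers)  # per-reader signal strength
    return np.array([
        empirical_auc(truth, skill * truth + rng.normal(0, 1, size=n_cases))
        for skill in skills
    ])

for setting, spread in [("laboratory", 0.1), ("clinic", 0.6)]:
    aucs = reader_aucs(spread)
    print(f"{setting:10s}: mean AUC = {aucs.mean():.3f}, "
          f"across-reader SD = {aucs.std(ddof=1):.3f}")
```

Under these assumed parameters, the mean AUCs come out similar while the across-reader standard deviation is visibly larger in the “clinic” condition, which mirrors the kind of difference in “spread” described above.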

In the 1980s, when a group of investigators, including myself, was working on stroke models and absolute measurements of regional and local cerebral perfusion using nonradioactive xenon computed tomography, the results of the Extracranial-Intracranial (EC/IC) Bypass Study were published. This randomized trial assessed the efficacy of a seemingly ideal surgical procedure that, at the time, was being performed in rapidly growing numbers around the world, and it showed that the procedure was not as beneficial as most had originally perceived. In effect, in the general population to which it had been applied at the time, it was more harmful than beneficial.

This was a great surprise and a significant disappointment to the field of microvascular neurosurgery. As members of a group of investigators interested in this very question, we strongly believed that we had a very “appropriate” way, namely, to use xenon computed tomographic perfusion measurements to select a subset of the population in question who “should clearly benefit” from this surgical procedure, despite the overall negative results of the EC/IC Bypass Study.

Shortly after the results of the EC/IC Bypass Study were published, we met with the principal investigator of the study and his associates in London, Ontario, Canada, to discuss this very issue. At the end of a long day of much discussion, the point was made in no uncertain terms that “if we as investigators truly believe our hypothesis, and there are no conclusive data to support a definitive conclusion, we should design a prospective study that directly tests this hypothesis by randomizing the very select subset of the population into ‘surgery’ and ‘no-surgery’ arms, and there is no way around it!”

“You clearly have to have the guts to randomize them to ‘surgery’ and ‘no-surgery’ groups and leave the ‘no-surgery’ group alone,” the principal investigator of the EC/IC Bypass Study repeated. “Yes, I know that this goes directly against your belief, since you ‘know’ you are right, and you will try all types of study designs that may circumvent the need to ‘risk,’ at least in your opinion, half the members of the very group you strongly believe are most likely to benefit from the operation by actually not operating on them just to test your own hypothesis.” He paused, took a minute, and continued, “However, until you are willing to do so (randomize), you will never prove the point the way you should, and in the long run, society will never benefit from your work as it should or could.” Not being a physician myself, I found it relatively easy to nod my head in agreement, but the neurosurgeons present felt strongly that there must be a way to avoid this “brute force” approach.

Although not perfect, a randomized controlled trial (RCT) is the most natural approach to studying these clinically important questions. In treatment-related studies (eg, medical, surgical, and radiation oncology), RCTs and similar approaches have been widely accepted and implemented for many years and in numerous studies. This type of study allows the assessment of outcomes in a largely natural progression of the populations being investigated under the different intervention or treatment arms. In most, if not all, oncology-related studies of this nature, the same patients cannot undergo both interventions (unlike competing diagnostic imaging examinations, which can often both be performed in the same patients); hence, the two arms are completely separated. Two large trials, the National Lung Screening Trial, a diagnostic imaging-based RCT headed cooperatively by the American College of Radiology Imaging Network (ACRIN), and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial, deserve much credit for leading the diagnostic imaging field in this regard.
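
As a purely illustrative sketch of the randomization step itself (not the design of any of the trials cited here), the short Python function below assigns hypothetical patient identifiers to ‘surgery’ and ‘no-surgery’ arms using permuted blocks, a common way to keep two arms balanced as patients accrue. The function name, block size, seed, and patient labels are all assumptions for illustration.

```python
import random

def permuted_block_randomization(patient_ids, block_size=4, seed=2024):
    """Assign patients 1:1 to two arms within balanced, shuffled blocks."""
    assert block_size % 2 == 0, "block size must be even for 1:1 allocation"
    rng = random.Random(seed)  # fixed seed so the allocation list is reproducible
    assignments = {}
    for start in range(0, len(patient_ids), block_size):
        block = patient_ids[start:start + block_size]
        arms = ["surgery", "no-surgery"] * (block_size // 2)
        rng.shuffle(arms)  # each block stays balanced; order within it is random
        assignments.update(zip(block, arms))
    return assignments

# Hypothetical patient identifiers, for illustration only.
patients = [f"P{i:03d}" for i in range(1, 13)]
for pid, arm in permuted_block_randomization(patients).items():
    print(pid, arm)
```

Permuted blocks are used here only to keep the two arms balanced over time; an actual trial would add stratification, eligibility checks, and allocation concealment far beyond this sketch.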


References

  • 1. Gur D., Bandos A.I., Cohen C.S., et al.: The “laboratory” effect: comparing radiologists’ performance and variability during prospective clinical and laboratory mammography interpretations. Radiology 2008; 249: pp. 47-53.

  • 2. Gur D., Bandos A.I., King J.L., et al.: Binary and multi-category ratings in a laboratory observer performance study: a comparison. Med Phys 2008; 35: pp. 4404-4409.

  • 3. Gur D., Bandos A.I., Klym A.H., et al.: Agreement of the order of overall performance levels under different reading paradigms. Acad Radiol 2008; 15: pp. 1567-1573.

  • 4. The EC/IC Bypass Study Group: Failure of extracranial-intracranial arterial bypass to reduce the risk of ischemic stroke. Results of an international randomized trial. N Engl J Med 1985; 313: pp. 1191-1200.

  • 5. Hillman B.J.: Economic, legal, and ethical rationales for the ACRIN National Lung Screening Trial of CT screening for lung cancer. Acad Radiol 2003; 10: pp. 349-350.

  • 6. Church T.R., National Lung Screening Trial Executive Committee: Chest radiography as the comparison for spiral CT in the National Lung Screening Trial. Acad Radiol 2003; 10: pp. 713-715.

  • 7. Pisano E.D., Gatsonis C.A., Yaffe M.J., et al.: American College of Radiology Imaging Network digital mammographic imaging screening trial: objectives and methodology. Radiology 2005; 236: pp. 404-412.

  • 8. Gur D.: Retrospective analyses of pivotal prospective studies with population segmentation—statistically based inferences and clinical relevance. Acad Radiol 2008; 15: pp. 1458-1462.

  • 9. Gur D., Sumkin J.H., Rockette H.E., et al.: Changes in breast cancer detection and mammography recall rates after the introduction of a computer-aided detection system. J Natl Cancer Inst 2004; 96: pp. 185-190.

  • 10. Fenton J.J., Taplin S.H., Carney P.A., et al.: Influence of computer-aided detection on performance of screening mammography. N Engl J Med 2007; 356: pp. 1399-1409.
