Radiology interpretation performance may slip late in sessions

Jul 29, 2014

Radiologists may spend less time on each case, and their diagnostic performance may shift, over the course of a long reading session, according to a new analysis from Warwick, U.K. But it's not yet clear whether that pattern represents a genuine clinical concern or is merely an artifact of the study designs.

A research team led by Sian Taylor-Phillips, PhD, from the department of health sciences, and Markus Elze, a PhD student in medical statistics, both at the University of Warwick, re-evaluated data from six previously published studies that assessed radiologist performance over the course of reading sessions. They shared their findings in a paper published online on 9 July in the Journal of Digital Imaging.

They found the time taken per case decreased by between 9% and 23% in the four studies that included time information. In addition, they determined that the three studies with the longest reading sessions also showed evidence of a decrease in sensitivity or an increase in specificity over the course of reading.

While the researchers found that radiologist behavior and performance may change systematically over the course of a reading session, they cautioned that these changes could also be due to reader adaptation to the experiments, which featured enriched case sets.

"Further research is required to ascertain whether this effect is present in radiology clinical practice," they wrote.

The group wanted to determine whether reader behavior and performance change over the course of a reading session. In particular, they sought to find out whether there was a decline in sensitivity, as described by the vigilance decrement theory (a decline in sensitivity to detecting targets associated with time spent on task), and/or a decrease in time taken per case alongside a sensitivity decrease or specificity increase, which would fit with the prevalence effect theory (i.e., that performance depends on the proportion of actual positive cases in a test set).

Significant differences

As the reading sessions progressed, the time taken per case decreased by 9% to 23%, including when reading mammograms, fracture x-rays, and CT exams. The decreases were all statistically significant (p < 0.005). Furthermore, the researchers found evidence of a decrease in sensitivity or an increase in specificity over the course of reading 100 chest x-rays (p = 0.005), 60 bone fracture x-rays (p = 0.03), and 100 chest CT scans (p < 0.0001).
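
For readers less familiar with the two measures, the following is a minimal Python sketch of how sensitivity and specificity might be compared between the first and second half of a reading session. It is not the researchers' method, and the data are made up purely for illustration; the function names and the simple split into halves are assumptions.

```python
from typing import List, Tuple

# One case in reading order: (truly abnormal?, reader called it abnormal?)
Case = Tuple[bool, bool]


def sensitivity_specificity(cases: List[Case]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a list of read cases."""
    tp = sum(1 for truth, call in cases if truth and call)          # abnormal, detected
    fn = sum(1 for truth, call in cases if truth and not call)      # abnormal, missed
    tn = sum(1 for truth, call in cases if not truth and not call)  # normal, cleared
    fp = sum(1 for truth, call in cases if not truth and call)      # normal, recalled
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def by_session_half(cases: List[Case]):
    """Split a session at its midpoint and score each half separately."""
    mid = len(cases) // 2
    return sensitivity_specificity(cases[:mid]), sensitivity_specificity(cases[mid:])


# Hypothetical eight-case session (illustrative values only, not study data).
session = [
    (True, True), (False, True), (True, True), (False, False),    # first half
    (True, False), (False, False), (True, True), (False, False),  # second half
]

early, late = by_session_half(session)
print("first half  (sensitivity, specificity):", early)
print("second half (sensitivity, specificity):", late)
```

In this invented session, the reader misses one abnormal case late on (lower sensitivity) but also stops recalling normal cases (higher specificity), which is the direction of change the researchers describe.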

"In two out of five datasets, this manifested itself as a reduction in sensitivity, and in two out of five datasets as an increase in specificity," they noted. "These differences may be driven by differences in case mix. In the studies with shorter sets of cases (27-50 mammograms), this effect was not seen."

The researchers did note an unexpected increase in sensitivity as the reading session progressed when reading sets of 27 mammograms.

"The largest effect was seen when examining 100 chest CT scans, which had the largest case set with each case having the greatest number of images (multiple slice), and the time per case was limited in the experimental design, which are the conditions known to increase the vigilance decrement," they added.

Priming effect?

Data showing an increase in sensitivity at the beginning of a reading session, followed by a plateau at a certain level and then a decline as more cases are read -- all while accompanied by specificity increases -- suggest a "perceptual priming" effect. This would indicate that radiologists are "fine-tuned" to discover cancers earlier in the session, according to the researchers.

However, as the session moves along and "few cancers are actually found, the priming effect is reduced, which would be reflected in a lower sensitivity rate at the end of a long reading session," they wrote. "Conversely, priming would also make them better at deciding that cancer is not present, which would result in an increase in specificity as the session progresses."

According to the authors, both the vigilance decrement and the prevalence effect predict a larger change over time when prevalence is lower, as it is in clinical practice compared with the enriched case sets used in the experiments.

"If the results observed here are due to these effects, then we would expect to see larger effects in radiology clinical practice," they wrote. "However, if the effects observed here are due to reader adaptation to the test set or experimental conditions in some way, then these effects will not translate into clinical practice. Further research is needed to determine if these effects do impact clinical reading as it would impact patient care."

Their findings "merit further investigation in a well-controlled environment with a larger participant cohort and optimal randomization to systematically measure how performance and eye-tracking behavior change over time at different levels of prevalence and with radiologist experience," they concluded. "Perhaps more importantly, radiologists' performance over time in clinical practice should be analyzed to determine whether changes over time predicted by the vigilance decrement and the prevalence effect manifest themselves in practice."