Researchers tackle false-positive challenge of CAD

May 4, 2011

Mammography computer-aided detection (CAD) software can help detect more cancers, albeit at the cost of numerous false-positive findings. But new approaches may allow radiologists to more effectively use CAD, according to a pair of presentations at the recent ECR in Vienna.

A team from Siemens Healthcare found that CAD could be used to automatically presort a subset of cases with increased cancer prevalence, making it more likely for radiologists to accept true-positive CAD marks. In another trial, the same researchers found that nearly 10% of cases could be categorized by CAD as either definitely normal or obvious cancers, bypassing the need for radiologist review.

Many marks generated by CAD are false, pointing to structures not related to malignancy. And since most cases are normal in a screening environment, radiologists assume CAD marks are false and are reluctant to accept them, said presenter Isaac Leichter, PhD, co-founder and chief science officer of Siemens Computer Aided Diagnosis.

But if a CAD application running in the background could flag a subset of cases with increased cancer prevalence, reader performance in this subset should improve, Leichter said.

"Under these circumstances, the reader should be more willing to accept true CAD marks," he said.

Since breast cancers are usually visible in two views, most cancers are marked by CAD in corresponding locations in both views. As a result, CAD marks of the same type (masses or calcifications) that are visible on both views would be considered matching CAD marks by the algorithm, Leichter added.
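The matching rule Leichter described can be sketched in a few lines of Python. This is an illustration only, not the Siemens algorithm: the `Mark` class, the `has_matching_marks` function, and the view labels "CC" and "MLO" are assumptions introduced here, and the sketch does not check whether the marks sit in corresponding locations, which a real implementation would presumably also need to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    """One CAD mark (hypothetical structure for illustration)."""
    view: str  # assumed view label, e.g. "CC" or "MLO"
    kind: str  # "mass" or "calcification"


def has_matching_marks(marks):
    """Return True if marks of the same type appear in at least two views."""
    views_by_kind = {}
    for mark in marks:
        views_by_kind.setdefault(mark.kind, set()).add(mark.view)
    return any(len(views) >= 2 for views in views_by_kind.values())


# A mass marked in both views counts as a match.
print(has_matching_marks([Mark("CC", "mass"), Mark("MLO", "mass")]))  # True
# Different mark types in the two views do not.
print(has_matching_marks([Mark("CC", "mass"), Mark("MLO", "calcification")]))  # False
```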

Normal cases should have matching CAD marks only by coincidence, as false CAD marks almost always appear in only one view, he said.

"Matching CAD marks should therefore create a subset with increased cancer prevalence," he said.

Using this approach, a Siemens prototype CAD algorithm calculates a certainty score (ranging from 0 to 100) reflecting the likelihood of malignancy for each possible finding. Only candidates with a certainty score above a defined threshold are selected, with all others filtered out, Leichter said.

This threshold defines the operating point of the algorithm. A low threshold implies filtering out very few candidates, resulting in extremely high sensitivity and a higher false-mark rate. However, CAD designed for tagging a subset of cases with "matching" marks may require a different operating point, Leichter said.

As a result, the researchers sought to automatically and prospectively presort all cases with matching CAD marks, hoping to determine the operating point that generates an optimal subset of cases with increased breast cancer prevalence, he said.

The study involved 15,892 full-field digital mammography (FFDM) exams, 280 of which were malignant. The CAD application was run on all cases at varying certainty-score thresholds.

For each operating point, all malignant and normal cases with matching CAD marks were automatically identified in order to create the presorted subset.

The researchers found that the operating point of the CAD algorithm affected the cancer prevalence in the subset; high CAD sensitivity could be maintained up to a certainty score threshold of 80.

"As the threshold increased, the percentage of cases with matching CAD marks decreased gradually for malignant cases but decreased quite rapidly for normal cases," he said.

As a result, the prevalence of cancers increased in the presorted subset. At a threshold of 80, 10.2% of the normal cases had matching CAD marks, compared with 65% of malignant cases.
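Those two percentages determine how strongly the presorted subset is enriched with cancers. The back-of-the-envelope check below uses only figures quoted in the article; it assumes that all 15,612 non-malignant exams count as normal cases and that the reported percentages are rounded, and the function name `enrichment_factor` is introduced here for illustration.

```python
# Study figures quoted in the article.
TOTAL_EXAMS = 15_892
MALIGNANT = 280
# Assumption: every non-malignant exam counts as a "normal" case.
NORMAL = TOTAL_EXAMS - MALIGNANT


def enrichment_factor(frac_malignant_flagged, frac_normal_flagged):
    """Cancer prevalence in the flagged subset divided by baseline prevalence."""
    flagged_malignant = frac_malignant_flagged * MALIGNANT
    flagged_normal = frac_normal_flagged * NORMAL
    subset_prevalence = flagged_malignant / (flagged_malignant + flagged_normal)
    baseline_prevalence = MALIGNANT / TOTAL_EXAMS
    return subset_prevalence / baseline_prevalence


# Threshold of 80: 65% of malignant and 10.2% of normal cases had matching marks.
factor_80 = enrichment_factor(0.65, 0.102)
print(f"{factor_80:.2f}")  # 5.82
```

Under these assumptions, roughly one in 10 cases in the presorted subset is malignant, compared with fewer than one in 50 in the full dataset.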

At a threshold of 80, cancer prevalence in the presorted subset increased by a factor of 5.82 in comparison with the full dataset, while capturing nearly two-thirds of the cancers, he said. With a threshold of 90, cancer prevalence increased by a factor of 10.4 while capturing about half of cancers.

"Automatic presorting by matching CAD marks can be used to substantially increase the prevalence of breast cancer in a prioritized subset," Leichter said. "CAD can be used prospectively in the background in a novel way to flag those cases requiring special attention."

Extreme CAD

The Siemens team also evaluated their CAD prototype for automated cherry-picking of cases that could bypass radiologist review.

The algorithm first identifies all possible mammographic lesion candidates, calculating a certainty score for each candidate that reflects its estimated likelihood of malignancy. Only candidates with a certainty score above a selected threshold are marked by the algorithm, with others filtered out. The number of displayed CAD marks depends on the preselected threshold that defines the operating point of the algorithm.

The researchers sought to evaluate the use of two extreme thresholds, aiming to capture cases that were definitely normal and those that were obviously cancer.

When the algorithm is run with a low threshold for the certainty score, very few candidates are filtered out. This results in extremely high sensitivity, but also higher numbers of false marks.

"Due to the extremely high sensitivity, almost all cases with no CAD marks should be normal," he said.

Ideally, true CAD marks should be displayed as matching marks in both views. However, matching CAD marks do appear in normal cases as well, Leichter said.

When the algorithm is run with a high threshold for the certainty score, most candidates are filtered out, resulting in a lower sensitivity and an extremely low false-mark rate.

Under those conditions, almost all cases with matching CAD marks should be cancers, he said.

When the algorithm is run in the background twice, once with an extremely low threshold and once with an extremely high one, it functions as an "extreme CAD" system, Leichter said. With the low threshold, CAD can identify definitely normal cases by looking for cases with no marks. With the high threshold, cases with matching marks would be considered obvious cancers and could be automatically recalled.
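The two-pass logic can be sketched as follows. This is a simplified illustration under stated assumptions, not the Siemens prototype: the `Candidate` class, the `triage` function, the view labels, and the threshold values in the usage lines are all hypothetical, and location correspondence between views is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One lesion candidate (hypothetical structure for illustration)."""
    view: str     # assumed view label, e.g. "CC" or "MLO"
    kind: str     # "mass" or "calcification"
    score: float  # certainty score, 0 to 100


def marks_at(candidates, threshold):
    """Keep only candidates whose certainty score is above the threshold."""
    return [c for c in candidates if c.score > threshold]


def has_matching(marks):
    """True if marks of the same type appear in at least two views."""
    views_by_kind = {}
    for m in marks:
        views_by_kind.setdefault(m.kind, set()).add(m.view)
    return any(len(views) >= 2 for views in views_by_kind.values())


def triage(candidates, low, high):
    """Run both arms and sort a case into one of three bins."""
    # Low arm: no marks even at very high sensitivity.
    if not marks_at(candidates, low):
        return "definitely normal"
    # High arm: matching marks survive even a very strict threshold.
    if has_matching(marks_at(candidates, high)):
        return "obvious cancer"
    return "radiologist review"


# Illustrative thresholds only; the operating points would need to be tuned.
print(triage([Candidate("CC", "mass", 5)], low=10, high=90))
# definitely normal
print(triage([Candidate("CC", "mass", 95), Candidate("MLO", "mass", 97)], low=10, high=90))
# obvious cancer
```

Every case that falls into neither bin still goes to the radiologist, so the choice of the two operating points controls how much of the workload can be offloaded.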

"Thus, this novel bimodal CAD application could enable automatic 'cherry picking' of both definitely normal cases and obvious cancers," he said.

The study team used the same 15,892 FFDM exams to evaluate the ability of their "extreme CAD" technique to correctly cherry-pick these cases.

A threshold of 10 for the certainty score yielded 1,411 normal cases and only one cancer with no CAD marks. This allowed 9% of the normal cases to be read only by the computer, with only 0.36% of cancers missed, Leichter said.
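The low-arm percentages follow directly from the case counts. The short check below uses only numbers quoted in the article and assumes that the 15,612 non-malignant exams are the "normal cases" used as the denominator; the variable names are introduced here for illustration.

```python
# Study figures quoted in the article.
TOTAL_EXAMS = 15_892
MALIGNANT = 280
# Assumption: every non-malignant exam counts as a "normal" case.
NORMAL = TOTAL_EXAMS - MALIGNANT

# Cases with no CAD marks at a certainty-score threshold of 10.
unmarked_normal = 1_411
unmarked_cancer = 1

normal_share = unmarked_normal / NORMAL     # normal cases read only by the computer
missed_share = unmarked_cancer / MALIGNANT  # cancers that would be missed

print(f"{normal_share:.1%}")  # 9.0%
print(f"{missed_share:.2%}")  # 0.36%
```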

A threshold of 100 for the certainty score yielded matching CAD marks in 84 cancers and only nine normal cases.

"Thus, the high arm of extreme CAD allows 71.1% of the cancers to be automatically recalled along with only 0.6% of normal cases," he said. "With optimized operating points, extreme CAD can be used to cherry pick the obvious cancers and the definitely normal cases."

As a result, the use of this bimodal CAD technique enabled 9.3% of the cases to be read only by a computer, with the remaining cases still requiring interpretation by a radiologist, Leichter said.

"The reader's workload is substantially reduced by identifying the normal cases using the algorithm run with an extremely low threshold," he said. "Thus, the greatest impact of extreme CAD is due to the low arm."