Speech recognition errors surprise Canadian breast facility

Mar 29, 2011

When radiologists use speech recognition technology to create reports and then do their own editing, report turnaround times in an imaging department often improve dramatically. But the trade-off for speed of delivery can be an increased error rate, as indicated by a study presented at the 2011 European Congress of Radiology (ECR).

In a poster presentation on abnormal findings in breast imaging reports, principal investigator Dr. Anabel Scaranelo, PhD, an assistant professor of breast imaging at the University of Toronto, reported that signed reports created using speech recognition technology were eight times more likely to contain major errors than those created by conventional dictation and transcription.

Radiologists affiliated with the University of Toronto compared the error rates of all breast imaging reports reviewed at multidisciplinary tumor board meetings during a 15-month period. The reports were almost equally divided between those prepared conventionally and those prepared using a speech recognition system.

"The study was conducted to identify what, if any, quality gaps existed when a new technology intended to improve the performance of our breast imaging specialists, residents, and fellows was implemented," Scaranelo said. "Our objective was to determine if we needed to make any additional changes."

The study was one of several quality initiatives undertaken by the Joint Department of Medical Imaging at the University Health Network, Mount Sinai Hospital, and Women's College Hospital, all affiliated with the University of Toronto, she added.

Scaranelo and colleagues reviewed 307 reports prepared conventionally by radiologists working at the Women's College Hospital from January 2009 through March 2010, as well as 308 reports prepared during the same time frame from Princess Margaret Hospital, where a speech recognition system was used. The radiologists worked at both hospitals. The objective was to identify the number of major and minor uncaught errors in the reports, and to analyze the error rate based on different factors.

The study found that 23% of the reports generated through the speech recognition system contained at least one significant error, compared with only 4% of the reports that were transcribed and proofread by a transcriptionist from conventional dictation.

Significant report errors by type of imaging study

	Speech recognition	Conventional dictation
Mammography reports	15%	0%
Breast MRI reports	35%	7%
Interventional procedures	13%	4%

The researchers found no substantial differences in error rates between reports prepared by staff radiologists and those prepared by residents and fellows. Error rates were also comparable between native and non-native English speakers.

The researchers defined minor errors as misspellings, an incidental missing word, or a nonsensical word or phrase that did not have any impact on the findings. Major errors included the following:

Omission or the addition of a critical word in a phrase relating to diagnosis
The substitution of one word for another in a way that was clinically relevant
A change in measurement that would impact patient management

Scaranelo gave examples of each:

"Mammographic signs of malignancy" instead of "no mammographic signs of malignancy"
The use of the word "duodenum" instead of "gadolinium" in the sentence, "There is no definite evidence of suspicious enhancement in either breast after the ______ administration."
2.0 mm instead of 2.0 cm

All the hospitals included in the study have plans to replace traditional dictation and transcription with speech recognition systems. To date, a system has been installed only at Princess Margaret Hospital, where it has been used since January 2009 to report breast studies. The project has experienced some delays, and Scaranelo said that this is providing an opportunity for the other radiologists to learn from the experience of the breast specialists.

The hospital's information technology department had extensively prepared the system to recognize the terminology and sentence structure used by the radiologists. Scaranelo estimated that more than 5,000 taped dictations generated by 11 radiologists were "fed" into the system.

Since the breast service already had templates for many of its reports, this group was the first to use the system. The radiologists received extensive training, and all actively interacted with the system to expand its vocabulary and to correct its misinterpretation of words or phrases. Some radiologists with a unique pattern of speech and a non-native English accent needed to do more work with the system than others, but this did not prove to be a problem.

The study began three months after an interim training period. Up to this point, signed reports were reviewed by office support staff to catch overlooked errors.

The percentage of uncaught major errors was a surprise to everyone, Scaranelo said, especially since radiologists seemed to be adhering to the report proofing protocol.

"We discussed the results of the study with the entire team," she said. "We didn't name names because uncaught major errors were rather evenly spread throughout the reports of the entire team. There was not a statistical prevalence of one or several radiologists."

The results showed that breast MRI was a potential source of errors, and the group has discussed the use of standardized templates for breast MRI exams. They have also discussed whether being interrupted by a colleague or a telephone call during the reporting and review process increases the possibility of overlooking errors in the report generated by the speech recognition system. The group may change its report proofing protocol if it confirms that this is the case, Scaranelo said.

One change made as a result of the study was the addition of "canned" text and more template features. Anecdotally, this has reduced the number of errors that need to be caught and corrected, according to Scaranelo. Her colleagues are also more aware that they need to be diligent in proofreading. Scaranelo thinks that everyone is trying harder to reduce the error rate.

The researchers would like other specialty groups in the department, such as musculoskeletal, chest, and neuroradiology, to conduct similar studies when it is appropriate to do so.

"This would make a very interesting comparative study that others might benefit from as well as ourselves," she concluded.