Error rate analysis proves vital to improve speech recognition

After three years of use, radiologists at Cork University Hospital in the Republic of Ireland thought they'd mastered the process of using a speech recognition dictation and self-editing system. But when both radiologists and oncologists were finding obvious errors in radiology reports, the department decided to investigate.

Findings of a spot investigation were presented at a scientific session during RSNA 2012, held in Chicago. The bad news was that the investigation confirmed a problem; the good news was that the radiologists responded positively and have made changes.

In June 2008, the radiology department at Cork University Hospital in Wilton, Cork, installed a speech recognition dictation system (currently Philips SpeechMagic v.6.1, Nuance Communications) that was interfaced with its PACS (Impax, Agfa HealthCare). After a period of training, the radiologists did their own self-editing and final proofing. Report turnaround time decreased dramatically, from an average of two to three days to three to four hours for inpatient exams.

The hospital is a national cancer referral center, and, because of this, the radiology department does a large number of CT exams of cancer patients. Radiologists participate in many weekly multidisciplinary team meetings to discuss cancer cases. In advance of the meetings, the radiology department receives a list of patients whose cases will be discussed.

When reviewing relevant imaging and reports prior to these meetings, radiologists began to identify discrepancies and errors in finalized reports, according to Dr. Maria Twomey, specialist registrar in radiology. Oncologist colleagues also occasionally questioned why a report contained information that seemed obviously different from what they saw on images, such as a tumor size stated in centimeters when the tumor appeared to measure that number of millimeters.

Twomey and colleagues undertook an analysis of CT reports of cancer patients that had been prepared between June 2008 and December 2011. They then scrutinized 350 randomly selected reports for errors.

Reports containing errors were categorized as either significant, but not considered likely to alter patient management, or very significant, with the meaning of the report affected. This latter group included nonsense phrases as a result of incorrect speech recognition. More important, the latter group had the potential to negatively affect patient management.

The researchers discovered 12% of the reports they analyzed contained errors. Of the total reports, 3% contained errors that could significantly alter patient management. This error rate was consistent throughout the time period of the study.

The majority of the significant errors involved word omission, in which a key word that could alter the meaning of the report had been dropped. An example that Twomey gave was the transcription of the phrase "within the pelvis there is significant lymphadenopathy," when what the radiologist had actually said was, "within the pelvis there is NO significant lymphadenopathy."

The second most common reason for significant errors related to measurements. It was common for "centimeters" to be replaced with "millimeters" and vice versa.

Reports with nonsignificant errors contained nonsensical phrases or word substitutions that had not been caught during proofreading. The researchers also identified grammatical errors.

The most surprising finding was that 60% of all errors occurred in reports read by two readers: a radiology trainee and the supervising consultant, according to Twomey. The errors probably occurred when the attending radiologist modified the report prepared by a resident trainee and failed to proofread the changes before signing off, she said.

Not unexpectedly, two-thirds of the reports that contained errors were finalized at the end of a day, between 4 p.m. and 6 p.m. Twomey attributed this to radiologists being tired and also rushed to complete the day's work.

The study's findings were presented at a department meeting. "My colleagues were surprised by what we had learned, although they were pleased to hear that the department error rate had dropped from a high of 14% for reports prepared in 2008 to a low of 3% in 2011 for this randomized sample," she said. "This showed we definitely had made progress."

"One of the things that we did to reduce our overall error rate was acknowledge that the speech engine had difficulty understanding our Irish accent. If we used American pronunciation for troubling words, the system understood us better and seemed to do a better job," Twomey said. "It also became apparent that people who worked to train the system got better results. It's not easy to transition from conventional dictation and working with the same transcriptionists for years to self-editing and proofreading."

A total of 16 attending radiologists and 10 residents had prepared the randomly selected reports. Interestingly, her colleagues came to Twomey to ask about their own error rates. "Everyone was very interested in learning how accurate they were, and what types of errors they had overlooked. My colleagues wanted to make positive changes to reduce their personal error rates," she said.

"Everyone realizes that they need to be more focused and not interrupted when they are proofing their reports. It is easy to see what one thinks has been dictated and overlook omissions. So people are particularly careful about looking for omissions and double checking measurements," she explained.

The department has adopted the use of structured reports as well, particularly for breast and prostate cancer imaging reports. This is improving efficiency and may be reducing the chance of a system interpretation error, Twomey said.

The department also initiated a discrepancy meeting every month as a quality assurance initiative. The meeting is a constructive forum to discuss errors in interpretation and also serious report errors that have been discovered.

"People are encouraged to report errors. We track error patterns. This is the only way to learn what is being done incorrectly and to make improvements," she said, noting that the atmosphere is very constructive and positive.

In a year or two, the error rate will be reassessed. Twomey and colleagues intend to report the results.

