ChatGPT is showing considerable promise for patient education in interventional radiology cases, according to a research group from Philipps University of Marburg in Germany.
In an article published on 23 October in CardioVascular and Interventional Radiology, the authors explored the use of GPT-3 and GPT-4 for in-depth patient education prior to interventional radiology procedures and found both versions of the model highly accurate.
“The feasibility and accuracy of these models suggest their promising role in revolutionizing patient care. Still, users need to be aware of possible limitations,” wrote first author Dr. Michael Scheschenja and colleagues.
Despite advances in interventional radiology and the field's rising popularity, many patients have a limited understanding of interventional procedures, the authors wrote. While using the internet to seek out health information is already common, the use of large language models like ChatGPT for such purposes requires validation, they added.
To that end, the group specifically explored the feasibility of using GPT-3 and GPT-4 for patient education prior to three common procedures: port implantations, percutaneous transluminal angioplasty, and transarterial chemoembolization.
The researchers posed 133 in-depth questions in English about these procedures. Two radiologists, with four and seven years of experience, rated the accuracy of the responses on a 5-point Likert scale. The questions covered general information, procedure preparation, the procedure itself, risks and complications, and postinterventional care.
GPT-3 and GPT-4 were deemed “completely correct” (score of 5) in 31% and 35% of their responses, respectively, and “very good” (score of 4) in 48% and 47%, according to the findings.
In addition, GPT-3 and GPT-4 provided “acceptable” (score of 3) responses 15.8% and 15.0% of the time, respectively. Lastly, GPT-3 was “mostly incorrect” (score of 2) in 5.3% of its responses, while GPT-4 had a lower rate, at 2.3%.
Ultimately, no response was identified as potentially harmful, the researchers noted.
“The results of this study demonstrate the potential of AI-driven language models, notably GPT-3 and GPT-4, as resources for specific patient education in [interventional radiology],” the group wrote.
The researchers also noted pitfalls: ambiguities may arise when ChatGPT deals with abbreviations, language comprehensibility may be an issue, and the chatbot’s lack of semantic understanding could lead to confusion between current best practices and obsolete information.
“Future studies should evaluate applicability of these models with real patients,” the researchers concluded.