ECR: Ethical AI in radiology -- why safety begins after deployment

VIENNA -- The algorithms are here. New regulations aren't. And the clinical evidence is still catching up. At ECR 2026, a session on AI ethics examined what fills that gap, and who bears the cost.

U.K. regulatory consultant Dr. Hugh Harvey started with a baseline. "If an AI is a medical device, it is automatically classified as high-risk under the AI Act." From there, things got more layered.

How the AI Act is reshaping medical AI

Harvey is one of the clearest voices on what the EU AI Act actually means for healthcare. He walked through where the law stands now and how it will soon shape medical AI development and deployment.

Dr. Hugh Harvey. Photo by Claudia Tschabuschnig.

Under the legislation, AI systems that qualify as medical devices automatically fall into the high-risk category. That triggers a chain of obligations: conformity assessments, risk management, postmarket monitoring -- all building on the framework already established under the Medical Device Regulation (MDR).

Then came the Digital Omnibus proposal. Stakeholders raised concerns about how the AI Act overlaps with existing medical device legislation, and the European Commission responded. The proposal aims to reduce regulatory friction and give industry more room to move. Compliance deadlines for high-risk AI systems classified under Annex I would shift from August 2026 to no later than Q2 2028, providing more time to align with standards that are, notably, still not finalized.

AI literacy requirements are also being renegotiated. Earlier drafts placed the responsibility on developers and deployers to train users. The Omnibus proposal shifts that toward EU institutions and member states. Harvey was skeptical.

"The people best placed to provide AI literacy are those working directly with the technology," he said.

The Act also introduces regulatory sandboxes -- supervised testing environments where developers can trial AI systems before wider deployment. Hospitals may even develop and share their own AI tools, a prospect that generated interest in the room, though the regulatory bar for doing so remains high.

From pilot to practice

Kicky van Leeuwen, PhD, from the Netherlands, made a point that landed quietly but stuck: The real work begins after deployment.

Kicky van Leeuwen, PhD. Photo by Claudia Tschabuschnig.

Hospitals invest heavily in validation before going live: retrospective testing, acceptance testing, pilot phases. But Van Leeuwen argued that these processes often measure internal performance, not real clinical outcomes. The environment keeps shifting. Imaging equipment evolves. Software updates. Patient populations change.

"The only constant is change," she said.

Most AI evaluations still chase technical metrics: sensitivity, specificity, and benchmark scores. Evidence of actual improvement in patient outcomes or healthcare efficiency remains limited. Van Leeuwen made the case for postmarket surveillance as the real standard: continuous monitoring once a system is running in clinical practice, tracking uptime, processing latency, output drift, reporting times, workflow fit, and user adoption together.

How much oversight a system needs, she added, should reflect its clinical risk level. Higher stakes, higher scrutiny. That part isn't optional.

AI meets hospital reality

Dr. Susan Shelmerdine, PhD, from London, brought the case study that made everything concrete. A district hospital in southwest London. An AI triage tool for chest x-rays. Early on, a patient with shortness of breath got their x-ray flagged, a same-day CT followed, and the cancer diagnostic pathway moved faster.

Dr. Susan Shelmerdine, PhD. Photo by Claudia Tschabuschnig.

The numbers held up over time. CT scans within the national target timeframe climbed from around 20% to nearly 50%. Average time from abnormal x-ray to CT fell from roughly six days to about three and a half. Real improvements. Real patients. Then 2025 arrived. The system was withdrawn. Not because the algorithm failed. Not because outcomes deteriorated. The funding ran out. That was the whole story.

Shelmerdine used it to make a bigger point. Healthcare systems are complex socio-technical environments. Technology doesn't land in a vacuum; it lands inside staff roles, workflows, and institutional structures. AI rarely eliminates work. More often, it moves it. In this case, radiographers found themselves explaining suspicious findings to patients and coordinating urgent CT scans, responsibilities they hadn't carried before and weren't resourced to absorb.

She introduced the gap between "work as imagined" and "work as done," a concept from human factors research describing the distance between what a workflow looks like on paper and what actually happens in practice. The AI triage tool exposed that gap. It also produced what researchers call the "ironies of automation": radiologists grew dependent on the tool, while radiographers, who carried the added workload, felt relief the day it was switched off. Same system. Opposite experiences.

Harvey added one observation that cut through the room: "Sometimes a radiologist says no. An AI system never does."

Shelmerdine also raised the concept of moral injury: the particular distress felt by staff who spent years working in good faith to improve patient care, only to be told the tool was gone -- not because it failed clinically but because of a budget decision. That damage doesn't show up in a performance metric.

Younger radiologists and trainees were hit hardest. Some found they could no longer report confidently once the system went offline. Dependency had developed faster than anyone had tracked. It raised a question Shelmerdine put plainly: The field has frameworks for responsible AI adoption. Does it have equivalent frameworks for responsible withdrawal? Skills maintenance, adequate notice, graduated step-downs. Because cutting access cold is its own kind of failure.

Accuracy goes beyond algorithms

The conversation at ECR 2026 pointed somewhere the field is still catching up to. The hard part is no longer building an accurate algorithm. It is demonstrating real-world value, safe integration into clinical systems, and long-term sustainability -- three different problems, all harder than benchmark performance.

Autonomous and agentic AI systems are moving from conference slides into clinical practice, a shift visible at both ECR and last year's RSNA. Harvey flagged one implication that few have fully reckoned with: AI embedded within quality assurance processes is itself classified as high-risk and requires its own monitoring layer. The near-term prospect, then, is AI systems that must oversee other AI systems.

The real test for medical AI is not the benchmark. It is everything that comes after deployment.

