In emergency radiology, the time between image acquisition and a signed report is the number that drives patient flow, bed management, and — in acute cases — clinical outcomes. The Joint Commission tracks it. Department heads obsess over it. For studies flagged as critical, the expectation at most academic centers is a report within 30 minutes of image acquisition. The national average for non-critical ED imaging sits closer to 90 minutes. At peak volume hours, it's worse.
When we began working with our first hospital deployment sites on workflow integration, turnaround time (TAT) was the metric they watched most carefully. Not diagnostic accuracy — they had benchmarks for that from prior studies. TAT was the operational pressure point, the number that determined whether the department was functioning or firefighting. Here's what we measured.
Baseline Data: Where Time Actually Goes
Before implementing any AI-based triage, we conducted a workflow audit at three sites — a Level I trauma center, a community hospital ED, and a large teleradiology practice. We instrumented the PACS to timestamp every state transition: study acquisition complete, study appears in radiologist worklist, radiologist opens study, report dictated, report signed.
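To make the audit concrete, here is a minimal sketch of that interval computation. The state names, tuple format, and function are illustrative stand-ins, not our PACS vendor's actual event codes or our production tooling.

```python
from collections import defaultdict

# Illustrative state names -- stand-ins, not actual PACS event codes.
STATES = ["acquired", "on_worklist", "opened", "dictated", "signed"]

def interval_metrics(events):
    """Compute per-study intervals (in minutes) between workflow states.

    `events` is an iterable of (study_id, state, timestamp) tuples,
    where timestamp is a datetime.datetime.
    """
    timestamps = defaultdict(dict)
    for study_id, state, ts in events:
        timestamps[study_id][state] = ts

    metrics = {}
    for study_id, ts in timestamps.items():
        if not all(s in ts for s in STATES):
            continue  # skip studies missing a state transition
        minutes = lambda a, b: (ts[b] - ts[a]).total_seconds() / 60
        metrics[study_id] = {
            "time_to_worklist": minutes("acquired", "on_worklist"),
            "worklist_to_open": minutes("on_worklist", "opened"),
            "open_to_signed": minutes("opened", "signed"),
            "total_tat": minutes("acquired", "signed"),
        }
    return metrics
```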
The findings were consistent across all three sites. Time-to-worklist was short — typically under five minutes. The bottleneck was worklist-to-open, which averaged 47 minutes at the trauma center and 63 minutes at the community hospital. Studies weren't sitting in a queue because no one was available — they were sitting because radiologists were working through a FIFO queue while multiple time-sensitive studies waited behind routine ones.
At the trauma center, we identified 340 studies over the 90-day audit period containing a time-sensitive finding (intracranial hemorrhage, pneumothorax, aortic abnormality) that sat in the queue for more than 40 minutes before being opened. Of those, 89 had studies of lower clinical urgency queued ahead of them. The radiologists reading those lower-urgency studies weren't doing anything wrong; they were reading in the order the studies arrived. But the consequence was delayed reads on findings that needed urgent attention.
AI Triage: Reordering the Queue
The intervention we implemented was worklist reordering based on AI-generated urgency scores. Every study entering the PACS is scored by the model within 90 seconds of image acquisition. Studies with positive predictions for specified critical findings — intracranial hemorrhage, large pneumothorax, pulmonary embolism on CTA, aortic dissection — are flagged and moved to the top of the active worklist. A secondary tier captures probable significant findings that aren't immediately life-threatening but warrant prioritization over routine reads.
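As a sketch of the reordering logic (field names and probability thresholds here are illustrative, not our production configuration): studies are bucketed into a critical tier, a secondary tier, and routine, then ordered first-in-first-out within each tier.

```python
from dataclasses import dataclass
from datetime import datetime

# Tier values sort in priority order; the 0.9 / 0.5 thresholds are illustrative.
CRITICAL, SIGNIFICANT, ROUTINE = 0, 1, 2

@dataclass
class Study:
    study_id: str
    arrived: datetime
    findings: dict  # finding name -> model probability

def triage_tier(study, critical_findings, crit_thresh=0.9, sig_thresh=0.5):
    """Map model output on the specified critical findings to a worklist tier."""
    top = max(
        (p for name, p in study.findings.items() if name in critical_findings),
        default=0.0,
    )
    if top >= crit_thresh:
        return CRITICAL
    if top >= sig_thresh:
        return SIGNIFICANT
    return ROUTINE

def reorder_worklist(studies, critical_findings):
    """Critical tier first, then significant, then routine; FIFO within each tier."""
    return sorted(studies, key=lambda s: (triage_tier(s, critical_findings), s.arrived))
```

Within each tier, studies keep their arrival order, so the only change a radiologist sees is that flagged studies appear at the top of the list.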
This isn't a new concept. Triage-based worklist management has been discussed in radiology informatics literature for years. What's changed is the reliability of the AI predictions feeding the triage logic. Early CAD systems generated enough false positives that radiologists learned to distrust their alerts. The 12-15% false positive rate common in first-generation systems was enough friction that departments often disabled the prioritization features after a few weeks.
For our intracranial hemorrhage model, the false positive rate on the emergency cohort is 4.8%, with a sensitivity of 96.3%. For large pneumothorax, it's 3.1% false positive rate, 97.8% sensitivity. Those numbers are low enough that alert fatigue hasn't emerged as a problem in our deployed sites — radiologists are engaging with the alerts rather than dismissing them.
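For readers who want the definitions behind those figures: both metrics come directly from confusion-matrix counts on the evaluation cohort. A generic sketch, not our evaluation code:

```python
def sensitivity(true_pos, false_neg):
    """Fraction of cases with the finding that the model flags."""
    return true_pos / (true_pos + false_neg)

def false_positive_rate(false_pos, true_neg):
    """Fraction of cases without the finding that the model incorrectly flags."""
    return false_pos / (false_pos + true_neg)
```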
Measured Outcomes Across Three Sites
After six months of AI-assisted worklist management, the turnaround time data looked like this:
At the Level I trauma center, median TAT for critical-flagged studies dropped from 68 minutes to 19 minutes. The 90th percentile — the cases that had been taking the longest — dropped from 124 minutes to 41 minutes. Studies correctly triaged as critical were being opened within a median of 7 minutes of appearing on the worklist, down from 47.
At the community hospital, median TAT for critical studies dropped from 81 minutes to 22 minutes. More notable was the reduction in after-hours delays. The community hospital ran with a single on-call radiologist from 11pm to 7am. Under the old FIFO queue, after-hours critical studies averaged a 94-minute TAT during low-volume overnight periods, not because of volume but because the on-call radiologist was cycling through the full queue systematically. With AI triage, the on-call radiologist's worklist surfaced critical studies immediately, and after-hours critical TAT dropped to 24 minutes.
At the teleradiology practice, where volume is higher and radiologists are reading across multiple client hospitals simultaneously, the impact was somewhat different. The overall TAT improvement was more modest — median critical TAT dropped from 58 to 31 minutes — but the variance decreased significantly. The 90th percentile dropped from 118 minutes to 49. Teleradiology environments, where a single radiologist may be managing three or four active worklists from different facilities, benefit substantially from a prioritization layer that consolidates critical findings across all queues.
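For reference, the median and 90th-percentile figures above are straightforward to compute once per-study TAT values are extracted from the workflow timestamps; a minimal sketch using Python's standard statistics module:

```python
import statistics

def tat_summary(tat_minutes):
    """Median and 90th-percentile TAT from a list of per-study TATs in minutes."""
    deciles = statistics.quantiles(tat_minutes, n=10, method="inclusive")
    return {
        "median_tat": statistics.median(tat_minutes),
        "p90_tat": deciles[8],  # 9th of 9 cut points = 90th percentile
    }
```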
What Didn't Change
Total departmental read volume per hour didn't change, and we weren't expecting it to. AI triage doesn't make radiologists read faster; it makes them read the right things first. Routine study TAT actually increased slightly at two of the three sites as critical studies were consistently moved ahead. That's the expected tradeoff of any prioritization system, and it's the right clinical choice.
We also didn't see changes in report quality metrics. Discrepancy rates between preliminary and final reads held steady. The concern that faster reads on prioritized studies might translate to lower diagnostic quality didn't materialize in the data.
What we observed at all three sites was a shift in radiologist workflow perception. In post-deployment interviews, radiologists described feeling more in control of their queue — knowing that the study in front of them was there because it needed attention, not because it happened to arrive at a particular time. That's a harder thing to measure than TAT, but it matters for the sustainability of any workflow intervention.