What Medicine needs to get right about AI
1 The AI Revolution in Medicine: Beyond the Hype
The transformer architecture has revolutionized AI, enabling systems to capture complex non-linear relationships in vast datasets. In medicine, this has led to remarkable capabilities:
- Clinical Communication: Applied to medical language, AI systems now understand clinical context and can answer patient questions at a level comparable to, or exceeding, that of doctors
- Administrative Efficiency: Applied to human conversations, these systems can automate clinical scribing and the writing of medical letters
- Workflow Enhancement: Applied to the EMR, text-to-action and computer-use capabilities could even automate tedious EMR navigation
- Research Advancement: Applied to massive multi-omic biological data in data-rich fields like oncology, AI foundation models will aid the next biomedical breakthroughs
2 The Implementation Challenge
We clinicians will be, or already are, using AI tools at work. It's crucial that we, as a field, speak the same language as those implementing these tools, both to ensure patient safety (recall the cautionary tale of Epic's sepsis model) and to use the tools properly. They are quite good, and we should make the most of them.
2.1 Understanding AI: Models vs Products
A crucial distinction often missed is that an AI model is not, by itself, a product. Take OpenAI as an example: while they excel at building powerful models, their success with ChatGPT comes from transforming a model into a helpful assistant. As highlighted in this brilliant Stanford talk, considering the specific context and software surrounding the model allows us to be both imaginative and practical.
2.2 The Clinical Decision Support Dilemma
Consider clinical decision support in radiology. Companies focus on creating high-performance diagnostic models, and there are practical uses in screening and in translating reports for patient understanding, but the pathway into routine clinical practice remains murky.
Currently, the main product built on these models is one that generates imaging reports. Here are some options for how such a product could sit in the workflow:
2.3 Implementation Models
Option 1: Human & AI Case Collaboration
- The clinician works on the case at the same time as the AI
- The AI report is visible for the clinician to use as desired

Option 2: AI-First Verification
- The AI generates the initial report
- The clinician reviews and validates it

Option 3: Human-First Verification
- The clinician writes the initial report
- The AI system performs an error check
- Discrepancies trigger senior clinician review

Option 4: AI as a Co-Worker
- The AI handles routine cases and calculates confidence/complexity metrics (see the sketch after this list)
- Complex cases are routed to a senior clinician where appropriate
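To make Option 4 concrete, here is a minimal sketch of confidence-based routing. The `AIRead` fields, the threshold values, and the assumption that the model emits calibrated confidence and complexity scores are all illustrative, not any vendor's API; real thresholds would need clinical validation.

```python
# Minimal sketch of Option 4's routing logic (illustrative only).
from dataclasses import dataclass

@dataclass
class AIRead:
    report: str        # draft report text
    confidence: float  # model's self-estimated confidence, 0-1 (assumed calibrated)
    complexity: float  # case-complexity score, 0-1 (assumed available)

# Hypothetical thresholds; a real deployment needs clinically validated values.
CONFIDENCE_FLOOR = 0.95
COMPLEXITY_CEILING = 0.30

def route_case(read: AIRead) -> str:
    """Send routine, high-confidence cases to AI sign-off; escalate the rest."""
    if read.confidence >= CONFIDENCE_FLOOR and read.complexity <= COMPLEXITY_CEILING:
        return "ai_reports"           # routine case: the AI handles it
    return "senior_clinician_review"  # complex or uncertain: route up the chain
```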
Without sufficient thought given to human-computer interaction, the picture looks pretty bleak.
Options 1, 2 and likely 3 leave radiologists time-poor and stressed out. Option 1's 'helpful' reporter product is like a genius who sometimes gets the hardest question right and sometimes the easiest question wrong. In a healthcare setting this has limited value: more time will be spent on every discordant case, which may not even result in better clinical performance. Option 2 is Option 1 in disguise; you risk over-reliance on, or outright ignoring of, useful outputs. Option 3 is more useful, as it sets clear boundaries on the human-AI relationship. By making the AI visible only in discordant cases, it may serve as a good tool to 'triage' scans up the chain of experience. However, you run into the same 'Who is right?' dilemma.
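To see why the discordant cases matter, a back-of-envelope calculation helps; every number below is invented purely for illustration:

```python
# Illustrative arithmetic for Option 1's hidden cost (all numbers made up).
baseline_read_min = 10.0  # unaided read time per case
discordance_rate = 0.10   # fraction of cases where AI and radiologist disagree
reconcile_min = 6.0       # extra time to resolve each discordant case

expected_min = baseline_read_min + discordance_rate * reconcile_min
print(f"{expected_min:.1f} min per case vs {baseline_read_min:.1f} min unaided")
# -> 10.6 vs 10.0: slower on average, with no guaranteed accuracy gain
```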
Financially, only Option 4 makes sense for radiology practices and hospitals. Ide & Talamas describe this as an autonomous agent replacing routine work, displacing humans towards more specialised problem-solving. If this leads to better patient outcomes, we must choose this option. However, it also means facing significant restructuring of training programs and retraining displaced early-career specialists.
3 Breaking Free from False Assumptions
Our limited options stem from several unfortunate assumptions/starting points:
- The best way to help radiologists is to diagnose for them
- The best way to help radiologists is to write reports for them
- AI is a black box that cannot truly reason, so we can't truly understand it
- Therefore, as long as we have high-quality training data of prior reports, we can generate high-quality reports and trust them
Reading medical imaging is itself a process. Why couldn't we have asked questions like:
- How can we automatically identify and show the radiologist the key references (Radiopaedia/StatDx) they would need to look at to solve this case?
- Can we automatically show the patient's last 5 CXRs, process them and identify exactly where changes have evolved? (A minimal sketch follows this list.)
- Considering the speed of system 1 thinking, how can we best display anomaly detection with attached tree-of-thought reasoning traces while enabling a clinician's systematic read of an image?
- During dictation, can we let the radiologist think out loud in a very unstructured manner, offering real-time reasoning feedback as well as scribing a high-quality radiology report?
- Can we automate and adapt reporting for specific protocolised research guidelines?
- Can we use LLMs to enhance inter-radiologist communication and get rapid second opinions from leading experts?
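The serial-CXR question shows how tractable these ideas are. Below is a minimal sketch, assuming the studies have already been fetched from PACS as intensity-comparable, rigidly registered 2D arrays; a real system would need proper DICOM handling, deformable registration and a display layer, none of which are shown here.

```python
# Minimal sketch of highlighting change across a patient's recent CXRs.
# Assumes pre-fetched, pre-registered 2D numpy arrays.
import numpy as np

def change_map(prior: np.ndarray, current: np.ndarray, threshold: float = 0.2) -> np.ndarray:
    """Boolean mask of pixels whose normalised intensity changed markedly."""
    prior_n = (prior - prior.mean()) / (prior.std() + 1e-8)
    current_n = (current - current.mean()) / (current.std() + 1e-8)
    return np.abs(current_n - prior_n) > threshold

def evolution_series(studies: list[np.ndarray]) -> list[np.ndarray]:
    """Pairwise change maps across e.g. the last 5 CXRs, ordered oldest first."""
    return [change_map(a, b) for a, b in zip(studies, studies[1:])]
```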
4 Why are we here?
Outside of resource-poor settings, there is little unmet clinical need for an autonomous radiologist agent. The explosion of AI, the abundance of radiology reports and the monetary value of creating a high-quality autonomous agent all culminate in foundation models that can perform exceptionally well.
However, given their training on human-labelled reports and diagnoses, I question whether we can truly grow in medicine with these types of models. Can we get closer to 'perfect medicine' with models that talk and breathe our biases?
Here is a direction I think would be more fruitful: we already have high-quality, intelligent staff, so why can't we empower them to work efficiently and become their best? All six of the questions I've posed, each aiming to directly augment a radiologist's work, are tractable now. Note that they are useful products, not necessarily new models.
Unsupervised, data-driven approaches can teach us so much about biomedicine; medicine will look incredibly different in the coming decades. We need nimble, well-supported staff, with both autonomous AI and better non-autonomous copilots, to maximise their clinical impact.
We'll explore non-autonomous copilots and autonomous AI in more detail here, including specifics of how we can think about human-AI interaction.