We can draw on a rich literature in human-computer interaction (HCI), applied automation in aviation, and the emerging field of human-centered AI (HCAI) to think about medical AI. Many of the new problems we’re running into are older problems in disguise. In this article, we’ll use that literature to work through key implementation challenges.
Key Context
This essay builds on foundational research from Jiang et al and expands their framework with practical medical applications.
The practical implementation of AI in professional work faces a critical challenge: how to effectively integrate automated systems while preserving human expertise and judgment. While current literature provides theoretical frameworks, we need concrete strategies for real-world application.
💡Key Context
Knowledge work automation represents a fundamental shift - AI’s embedded world models make human knowledge scalable beyond the previous limitation of human time.
AI tools exist on a spectrum from autonomous to assistive:
ℹ️Note
Most implementations will likely fall somewhere between fully autonomous and purely assistive, adapting based on the specific context and risk level of the domain. For example, a medical copilot might autonomously gather drug interaction data while requiring human oversight for final dosing decisions.
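One way to picture this spectrum is a small routing policy that decides, per task, whether the AI may act on its own or must hand off to a clinician. The sketch below is illustrative only: the `RiskLevel` tiers, the `Task` type, and the `requires_human_signoff` rule are assumptions made for this example, not something taken from the cited work.

```python
from dataclasses import dataclass
from enum import Enum


class RiskLevel(Enum):
    """Hypothetical risk tiers; a real deployment would define its own."""
    LOW = 1     # e.g. retrieving reference data
    MEDIUM = 2  # e.g. drafting a suggestion for review
    HIGH = 3    # e.g. anything that changes patient treatment


@dataclass(frozen=True)
class Task:
    name: str
    risk: RiskLevel


def requires_human_signoff(task: Task) -> bool:
    """Assumed rule: only low-risk tasks run without clinician review."""
    return task.risk is not RiskLevel.LOW


tasks = [
    Task("gather drug interaction data", RiskLevel.LOW),
    Task("finalise dosing decision", RiskLevel.HIGH),
]

for task in tasks:
    mode = "human sign-off" if requires_human_signoff(task) else "autonomous"
    print(f"{task.name}: {mode}")
```

The point of the sketch is that the autonomy level is a property of the task and its risk, not of the tool as a whole.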
For effective integration, well-trained staff is critical. Jeremy Howard’s new law firm, Virgil, aims for bottom-up integration of AI with staff education and AI integration from the get-go.
Like law, medicine is a field where minimising risk is crucial, and patient safety must be at the forefront. Jiang et al have identified 3 key tensions we need to resolve before we can integrate AI. We’ll largely summarise their fantastic paper while applying it to medicine - please read their full paper and cite them if you use this work.
Before examining the tensions in human-AI interaction, we need to understand Situation Awareness (SA). While SA originated in aviation, it offers valuable insights into how humans and AI systems interact.
At its core, SA describes how humans gather and process information to solve problems. Think of it as the foundation that supports decision-making - separate from the decisions themselves.
Even skilled decision-makers can make mistakes if their SA is incomplete or incorrect. Similarly, someone with excellent awareness might still make wrong choices due to gaps in knowledge or training.
SA unfolds in three distinct levels:
Situation Awareness
Let’s explore how SA works in a hospital setting:
Picture yourself on a ward round. You’re taking in everything around the patient - their appearance, vital signs, recent test results, and examination changes. You notice the electronic medical record (EMR) displaying key data, while also being aware of the broader environment: available nursing staff, family presence, and the treating team. This initial stage builds your mental model of the current situation.
Now comes synthesis. A doctor combines all these elements into meaningful patterns. For instance, seeing low blood pressure alongside cool extremities and reduced urine output forms a clear picture of poor tissue perfusion - a gestalt that’s more meaningful than any single observation.
This is where experience and implicit learning shine. Based on the patterns recognized, a clinician can anticipate likely developments. They might predict that a patient with worsening respiratory symptoms and dropping oxygen levels will need intensive care soon, allowing them to start preparations early.
Traditional SA focuses on individual decision-makers, but modern healthcare demands a broader view.
The Distributed Situation Awareness model recognizes that cognition is shared across a network of human agents, technological tools, and now AI systems. This creates a complex environment where the whole system’s capabilities exceed what any individual could achieve alone.
Distributed Situation Awareness
Take a modern aircraft landing. The pilot doesn’t manually calculate wind speeds, drag coefficients, or optimal flap positions. Instead, they interact with a carefully designed system that:
The pilot achieves “perfect” task awareness not by knowing every calculation, but by understanding the right information at the right time. They trust the system to handle complex calculations while focusing on higher-level decisions about the landing approach.
This is the concept of “transactional memory”. Information is distributed across people and technology. Healthcare has always relied on distributed knowledge:
The introduction of AI creates new possibilities for knowledge work:
1. Dynamic Knowledge Retrieval
AI shifts the balance of distributed situation awareness: its capabilities can complement human expertise in ways that were previously impossible.
Medicine provides a crucial case study in AI integration, where patient safety is paramount. Drawing from Jiang et al.’s work, three fundamental tensions emerge:
💡Tip
You need the right ‘window’ into the assistant AI, with the right information at the right time for a highly trained clinician. This enables effective human-AI collaboration that doesn’t restrict human agency & correctly uses AI.
Consider a future where AI manages:
Better investigation can help us better characterise disease but this likely requires significant data analysis. This data analysis may mean greater diagnostic accuracy and treatment precision, leading to improved patient outcomes. However, the tension would arise when the intelligent system makes clinical recommendations based upon complex biomarker patterns that are opaque to the human physician and at a level of molecular detail that the human physician cannot fully process, leading to a mismatch in shared understanding between humans and AI. This negates the benefit gained from AI-powered data analysis.
Consider a much nearer future in which AI looks through the EMR and flags where treatment deviates from guidelines. The clinician needs to be trained to understand deeply how these AI-driven systems work and also how and when they fail. Automation bias, the tendency to over-rely on automated output, can contribute to the deskilling of staff and has been shown to disproportionately affect non-specialist doctors in ECG interpretation. This means users need substantial domain expertise in both medicine & applied AI to safely stay in the loop.
To improve human agency, SA is applied as such:
Current Status (Level 1 SA - Perception)
- Shows the AI’s goals and current actions
- Displays environmental data and performance metrics
- Helps users see what the AI sees
Reasoning Process (Level 2 SA - Comprehension)
- Explains why the AI makes specific choices
- Shows constraints affecting AI decisions
- Helps users understand AI behaviour
Future Projections (Level 3 SA - Projection)
- Predicts next steps
- Helps users see what the reasoning leads to
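The three SA levels above could be carried by a simple view model that the interface renders section by section. This is a minimal sketch under stated assumptions: the `SADisplay` class, its field names, and the example strings are hypothetical and only mirror the three levels described in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SADisplay:
    """Hypothetical view model: what the AI exposes at each SA level."""

    current_status: List[str] = field(default_factory=list)  # Level 1: perception
    reasoning: List[str] = field(default_factory=list)       # Level 2: comprehension
    projections: List[str] = field(default_factory=list)     # Level 3: projection

    def render(self) -> str:
        """Return a plain-text panel with one section per SA level."""
        sections = [
            ("Current Status (Level 1 SA - Perception)", self.current_status),
            ("Reasoning Process (Level 2 SA - Comprehension)", self.reasoning),
            ("Future Projections (Level 3 SA - Projection)", self.projections),
        ]
        lines: List[str] = []
        for title, items in sections:
            lines.append(title)
            # An empty level is shown explicitly so the gap is visible to the user.
            shown = items if items else ["(nothing reported)"]
            lines.extend(f"- {item}" for item in shown)
        return "\n".join(lines)


display = SADisplay(
    current_status=["Goal: flag treatment that deviates from guidelines"],
    reasoning=["Constraint: only data recorded in the EMR was considered"],
)
print(display.render())
```

Keeping the levels as separate fields makes it obvious when a system reports its status but says nothing about its reasoning or where that reasoning leads.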
To illustrate these principles, consider an AI system supporting skin cancer screening:
A dermatologist examines a patient with multiple atypical moles. The AI flags one as high-risk melanoma (89% confidence), though it doesn’t match typical patterns. Here are important considerations to retain clinician agency & augment performance.
Current Status (Level 1 SA - Perception)
- Clear images of the lesion
- Enhanced visualizations
- Feature detection results
- Heat maps of concerning areas
Reasoning Process (Level 2 SA - Comprehension)
- Breakdown of each concerning feature
- Similar cases from training data
- Notes on atypical features
Future Projections (Level 3 SA - Projection)
- Statistical outcome predictions
- Recommended monitoring schedule
- Growth projections
- Alternative diagnoses to consider given biopsy results
💡Benefits for Physician Autonomy
This approach maintains physician control & creates effective AI-physician collaboration by:
- Making AI decisions transparent in the way the physician thinks (AI SA supports physician SA)
- Allowing the physician to investigate AI recommendations, because they fit into the physician’s own SA
Consider an AI clinical decision support system. It uses all of the available multimodal data it has (e.g. all EMR data, all imaging, all relevant guidelines) and suggests next actions in a clinical workflow.
Without intelligently handling the known unknowns and the unknown unknowns, the system risks presenting its output with overconfidence, harming physician autonomy and trust.
A 58-year-old patient presents to the Emergency Department with abdominal pain. The hospital’s AI system analyzes available data:
EMR review:
Based on these findings and clinical guidelines, the AI system confidently recommends an immediate surgical consultation for an appendicectomy/appendectomy.
However, critical information remains outside the AI’s analysis:
Known gaps
- Recent travel history (not in EMR)
- Current medication list (undocumented)
- Precise onset timing of pain
Potential unknowns
- Anatomical variations
- Unusual pathogens
- Undocumented family history of inflammatory conditions
The treating physician, drawing on years of clinical experience and subtle patient cues, senses something atypical about the presentation. Rather than proceeding directly to surgery, they pursue additional workup - ultimately discovering a rare parasitic infection contracted during the patient’s recent international travel, which had mimicked appendicitis.
This scenario demonstrates how an AI system that doesn’t explicitly acknowledge uncertainty could prematurely narrow the diagnostic consideration and potentially override valuable physician intuition and clinical judgment. A better system would present its recommendations with appropriate caveats and confidence levels, explicitly noting what information is missing or uncertain, better supporting physician decision-making.
Applying SA in this scenario means expressing the uncertainty of the AI system. The extent to which displaying these uncertainty metrics actually improves physician confidence and decision-making remains unclear - would overall outcomes improve if it showed 65% confidence with clear knowledge gaps, versus 85% confidence without acknowledging uncertainties? Does breaking down uncertainty into specific components help or hinder clinical workflow?
An SA-oriented design would structure uncertainty representation across three cognitive levels, helping physicians build a complete mental model of the situation:
Level 1: Perception
- Shows what data is missing from the AI’s patient model - including physical exam nuances, patient affect, and symptom progression
- Displays standard data like CT and lab results alongside clear indicators of what information isn’t captured
- Makes gaps in the AI’s understanding explicit to help doctors put recommendations in context
Level 2: Comprehension
- Shows how different uncertainties combine to affect the AI’s confidence levels
- Demonstrates relationships between factors (e.g., how unusual pain patterns + incomplete medication history + unclear symptom timing affect diagnostic confidence)
- Reveals the AI’s reasoning process rather than just showing isolated confidence scores
Level 3: Projection
- Enables physicians to actively explore different diagnostic and treatment pathways while considering uncertainties
- Allows you to drill into specific concerns (like possible travel-related infections) and update the AI’s reasoning to see how the projection changes (“If this patient did travel to India for 3 months, what are your differentials now?”)
This approach recognizes that managing clinical uncertainty isn’t passive - physicians need to actively engage with the information, controlling how they view and interpret different uncertainty elements based on their expertise and the specific patient context. By organising uncertainty information around clinical goals and supporting both detailed symptom analysis and broader diagnostic consideration, the system helps physicians develop their own situation assessment while maintaining appropriate confidence in both the AI’s suggestions and their own clinical judgment.
This design philosophy helps harmonize the tension between the AI’s inherent limitations and the physician’s need for confident decision-making, without compromising the crucial role of clinical expertise and intuition.
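As a rough illustration of that idea, a recommendation could travel with its uncertainty instead of being shown as a bare score. The sketch below is an assumption-laden example, not a reference design: the `UncertaintyReport` class and its fields are hypothetical, the 65% figure is only the illustrative number used earlier, and the gap lists are copied from the appendicitis scenario.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UncertaintyReport:
    """Hypothetical payload pairing a recommendation with its uncertainty."""

    recommendation: str
    confidence: float              # model-reported probability in [0, 1]
    known_gaps: List[str]          # data the system knows it lacks
    potential_unknowns: List[str]  # factors it cannot rule out

    def summary(self) -> str:
        """Render confidence together with its gaps, never as a bare number."""
        lines = [
            f"Recommendation: {self.recommendation} "
            f"({self.confidence:.0%} confidence)",
            "Known gaps:",
        ]
        lines.extend(f"- {gap}" for gap in self.known_gaps)
        lines.append("Potential unknowns:")
        lines.extend(f"- {item}" for item in self.potential_unknowns)
        return "\n".join(lines)


report = UncertaintyReport(
    recommendation="Surgical consultation for appendicectomy",
    confidence=0.65,  # illustrative figure only
    known_gaps=[
        "Recent travel history (not in EMR)",
        "Current medication list (undocumented)",
        "Precise onset timing of pain",
    ],
    potential_unknowns=[
        "Anatomical variations",
        "Unusual pathogens",
        "Undocumented family history of inflammatory conditions",
    ],
)
print(report.summary())
```

Whether showing this much alongside every recommendation helps or hinders clinical workflow is exactly the open question raised above; the sketch only shows that the information can be kept together.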
This final tension underlies the entire human-AI interaction. Modern clinical AI systems are inherently complex, integrating multiple components and data sources. In our ED scenario, the system simultaneously processes:
As the system grows in complexity, choosing what to show to clinicians grows in importance.
Just as the micro-calculations of flap adjustment don’t need to be visible to the pilot, physicians need a balanced view: enough to maintain trust while preventing information overload. Operational complexity can stay manageable, with moderate perceived complexity, despite significant objective complexity.
💡Complexity Definitions
This is the hardest balance to strike. Perceived complexity changes vastly based on interface design decisions:
An SA approach would manage complexity through a layered approach:
Routine Cases (Low Complexity):
Complex/Atypical Cases (Medium Complexity):
Emergent Situations (High Time Pressure):
The system should adapt its complexity presentation based on factors like:
This framework recognizes that perceived complexity isn’t static - what seems straightforward at the start of a shift might feel overwhelming during a busy night. By structuring information presentation across these varying complexity levels, the system supports physicians’ natural diagnostic reasoning while maintaining their autonomy.
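A layered presentation like this could be driven by a small selection rule. The following is a minimal sketch, assuming just two input signals (`is_atypical` and `is_emergent`) and three hypothetical detail levels. The names and the rule that time pressure takes priority are assumptions for illustration, not part of the framework itself.

```python
from enum import Enum


class DetailLevel(Enum):
    """Hypothetical presentation layers, one per case category."""
    SUMMARY = "summary"              # routine cases, low complexity
    EXPANDED = "expanded"            # complex or atypical cases
    CRITICAL_ONLY = "critical only"  # emergent situations, high time pressure


def choose_detail_level(is_atypical: bool, is_emergent: bool) -> DetailLevel:
    """Pick how much of the system's reasoning to surface by default.

    Assumed rule: time pressure wins over atypicality, so an emergent
    atypical case still gets the most compact view first.
    """
    if is_emergent:
        return DetailLevel.CRITICAL_ONLY
    if is_atypical:
        return DetailLevel.EXPANDED
    return DetailLevel.SUMMARY


print(choose_detail_level(is_atypical=True, is_emergent=False).value)
```

In practice the clinician should be able to override the default layer, since perceived complexity depends on the person and the moment as much as on the case.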
For example, in our appendicitis case:
This SA-oriented approach helps physicians maintain situational awareness without being overwhelmed by the system’s inherent complexity, ultimately supporting better clinical decision-making while preserving physician autonomy and confidence.
Medical AI integration requires careful attention to three key tensions:
The successful integration of AI in medicine depends on understanding how humans and machines can work together effectively. By applying lessons from aviation and human-computer interaction, we can design AI systems that enhance rather than diminish medical expertise.
Situation awareness provides a valuable framework for addressing these challenges. When AI systems are designed to support clinician perception, comprehension, and projection, they can genuinely augment medical decision-making rather than attempting to replace clinical judgment.
The future of medical AI lies not in autonomous systems that work in isolation, but in thoughtfully designed tools that strengthen the doctor-patient relationship and improve clinical outcomes. As we continue developing these systems, maintaining this human-centered perspective will be crucial for realizing the full potential of AI in healthcare.