🩺 No Black Boxes: Knowledge-Enhanced Causal AI Agents for Healthcare

II-KEA is a Knowledge-Enhanced Agentic causal discovery framework that makes clinical AI both Interpretable & Interactable — transparent and clinician-driven.

📄 Read Paper ⭐ View Code

Deep Learning in Healthcare is a Black Box

Clinicians need to understand why a model makes a prediction, not just what it predicts, and they need a way to steer it with their own expertise. Most deep learning models offer neither.

✗ Traditional Deep Learning

  • Opaque reasoning — no explanations
  • No clinician interactivity or customization
  • Cannot incorporate domain expertise
  • Undermines clinician trust and adoption

✓ II-KEA (Our Approach)

  • Explicit causal reasoning & explanations
  • Clinicians can inject their own knowledge
  • Customizable goals and knowledge bases
  • Superior performance on real EHR datasets

How II-KEA Works

Three LLM agents (knowledge synthesis, causal discovery, and decision making) collaborate to turn a patient's diagnosis history into an interpretable, causally grounded prediction.

INPUT

🧑‍⚕️ Patient History
EHR · Diagnosis Records · ICD-9 · Multi-visit

📊 Transition Matrix
Disease Probabilities AT
0.3  0.4  0.0
0.3  0.2  0.6
0.5  0.4  0.1

🎯 Candidate Diseases
Shortlisted by Matrix
  ✓ Hypertension
  ✓ Diabetes
  + more…

🤖 Agent 1: Knowledge Synthesis
RAG · Vector DB
"Medical knowledge related to conditions associated with…"

🤖 Agent 2: Causal Discovery
DAG · Fitting Scores
"Produce a DAG to represent causality… Output in JSON form."
🔄 Repeat w/ Fitting Scores

🤖 Agent 3: Decision Making
Prediction · Clinician
"…the set of diseases the patient may be diagnosed in the future."
👨‍⚕️ Clinician-in-Loop

OUTPUT

📋 Diagnosis: Ranked ICD-9 Codes
💬 Explanations: Reasoning Chain
🕸️ Causal Graph: DAG.json
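
The first stage of the pipeline can be pictured with a small sketch. This page does not say how the transition matrix is estimated or how candidates are shortlisted, so everything here is an assumption: transitions are counted between consecutive visits, each row is normalized to sum to 1, and the top-k codes reachable from the latest visit are kept. The function names `build_transition_matrix` and `shortlist_candidates` are hypothetical and are not taken from the II-KEA code.

```python
import numpy as np


def build_transition_matrix(patients, n_codes):
    """Estimate P(code j at next visit | code i at current visit) by counting.

    `patients` is a list of patients, each a list of visits, each visit a set
    of integer code indices. This estimator is an assumption for illustration.
    """
    counts = np.zeros((n_codes, n_codes))
    for visits in patients:
        for current, following in zip(visits, visits[1:]):
            for i in current:
                for j in following:
                    counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Codes that never precede another visit keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def shortlist_candidates(matrix, last_visit, k):
    """Rank codes by summed transition probability from the latest visit."""
    scores = matrix[sorted(last_visit)].sum(axis=0)
    order = np.argsort(-scores, kind="stable")
    return [int(j) for j in order[:k] if scores[j] > 0]


# Toy data: 0 = hypertension, 1 = diabetes, 2 = chronic kidney disease.
toy_patients = [[{0}, {1}], [{0}, {1, 2}], [{1}, {2}]]
toy_matrix = build_transition_matrix(toy_patients, n_codes=3)
toy_candidates = shortlist_candidates(toy_matrix, last_visit={0}, k=2)
```

In this toy run the shortlist for a patient whose latest visit contains only code 0 is codes 1 and 2, in that order. The shortlist narrows what the agents have to reason about; it is not the prediction itself.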

State-of-the-Art Performance

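II-KEA is evaluated on two real-world EHR benchmarks, MIMIC-III and MIMIC-IV. It achieves the best score on every reported metric except w-F1 on MIMIC-IV, where GT-BEHRT scores higher (30.17 vs. 29.87), while providing interpretability that pure deep learning models cannot.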

MIMIC-III

Model            w-F1    R@10    R@20
RETAIN           18.37   32.12   32.54
Dipole           14.66   28.73   29.44
SeqCare          24.36   37.47   40.53
GT-BEHRT         25.21   36.15   40.97
GraphCare        25.16   36.74   41.89
DualMAR          25.37   38.24   41.86
II-KEA (Ours)    28.61   38.52   43.86

MIMIC-IV

Model            w-F1    R@10    R@20
RETAIN           23.11   37.32   40.15
Dipole           22.16   36.21   38.74
SeqCare          26.12   42.91   46.25
GT-BEHRT         30.17   44.93   50.67
GraphCare        27.59   42.07   48.19
DualMAR          27.97   44.07   48.19
II-KEA (Ours)    29.87   45.66   51.73

Results reported as average (%) over 5 runs. w-F1 = weighted F1; R@k = Recall@k.
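
For readers unfamiliar with the two metrics, here is a minimal sketch using their common definitions: Recall@k is the share of a visit's true codes that appear among the top-k predictions, and weighted F1 averages per-code F1 with weights proportional to how often each code truly occurs. The exact evaluation protocol behind the tables (thresholding, averaging across visits) is not described on this page, so treat the details as assumptions.

```python
def recall_at_k(ranked_codes, true_codes, k):
    """Fraction of the true codes found in the top-k ranked predictions."""
    if not true_codes:
        return 0.0
    hits = len(set(ranked_codes[:k]) & set(true_codes))
    return hits / len(true_codes)


def weighted_f1(predicted_sets, true_sets):
    """Per-code F1, averaged with weights equal to each code's true count."""
    total_support = sum(len(t) for t in true_sets)
    if total_support == 0:
        return 0.0
    score = 0.0
    for code in set().union(*true_sets):
        pairs = list(zip(predicted_sets, true_sets))
        tp = sum(1 for p, t in pairs if code in p and code in t)
        fp = sum(1 for p, t in pairs if code in p and code not in t)
        fn = sum(1 for p, t in pairs if code not in p and code in t)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        score += (tp + fn) * f1  # tp + fn is the code's support
    return score / total_support


# One visit: two true codes, one of them ranked in the top 2.
r = recall_at_k(["401", "250", "585"], {"250", "428"}, k=2)

# Two visits: code "a" is always right, code "b" is always wrong.
f = weighted_f1([{"a", "b"}, {"a"}], [{"a"}, {"a", "b"}])
```

Here `r` is 0.5 and `f` is 2/3, because code "a" carries two thirds of the total support and is predicted perfectly while code "b" is never predicted correctly.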

🧠

Causal Transparency

Every prediction comes with a causal graph showing which prior conditions likely caused the new diagnosis.
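
As a rough illustration of how such a graph could be consumed, the sketch below parses an edge list, rejects it if it contains a cycle, and walks back from a diagnosis to all of its upstream causes. The JSON schema (an `edges` list of `cause`/`effect` pairs) and the helper names are assumptions; the actual layout of DAG.json is not documented on this page. It relies only on the standard-library `graphlib` module (Python 3.9+).

```python
import json
from graphlib import CycleError, TopologicalSorter


def load_causal_graph(dag_json):
    """Return (parents, order) for an edge list given as JSON text.

    Assumed schema: {"edges": [{"cause": ..., "effect": ...}, ...]}.
    `parents` maps each node to its direct causes; `order` lists causes
    before their effects. Raises ValueError if the edges form a cycle.
    """
    parents = {}
    for edge in json.loads(dag_json)["edges"]:
        parents.setdefault(edge["effect"], set()).add(edge["cause"])
        parents.setdefault(edge["cause"], set())
    try:
        order = tuple(TopologicalSorter(parents).static_order())
    except CycleError as err:
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err
    return parents, order


def all_causes(parents, node):
    """Collect every direct or indirect cause of `node`."""
    seen, stack = set(), [node]
    while stack:
        for cause in parents.get(stack.pop(), ()):
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


example = json.dumps({"edges": [
    {"cause": "diabetes", "effect": "CKD"},
    {"cause": "hypertension", "effect": "CKD"},
    {"cause": "CKD", "effect": "heart failure"},
]})
graph, topo_order = load_causal_graph(example)
upstream = all_causes(graph, "heart failure")
```

With this toy graph, `upstream` contains CKD, diabetes, and hypertension. Checking acyclicity before use matters because an LLM asked for a DAG can still return edges that loop back on themselves.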

💬

Clinician-in-the-Loop

Clinicians can add their own knowledge sources and inject comments to personalize predictions.

📖

Knowledge-Grounded RAG

Retrieval-Augmented Generation (RAG) grounds each prediction in passages retrieved from the medical knowledge base.
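
The retrieval step can be pictured with a toy example. A real deployment would use a learned embedding model and a vector database; the sketch below swaps in plain word-count vectors and cosine similarity so that it runs with the standard library alone. The function names and the sample passages are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: lower-cased word counts (not a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, top_k=2):
    """Return the top_k passages most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(query_vec, embed(p)), reverse=True)
    return ranked[:top_k]


passages = [
    "hypertension raises the risk of chronic kidney disease",
    "asthma is a chronic airway condition",
    "diabetes can damage the kidney over time",
]
context = retrieve("kidney disease risk", passages, top_k=2)
```

The retrieved passages would then be placed in the agent's prompt, so the model reasons over the supplied text instead of relying only on what it memorized during training.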

🔄

Iterative Refinement

The causal discovery agent revises the graph round by round, guided by data fitting scores, until it converges.
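
One way to picture this loop is a generic propose-score-accept cycle, sketched below. It is not the II-KEA implementation: `propose` stands in for the LLM agent, `fitting_score` for whatever data-fit measure is used (not specified on this page), and the stopping rule (improvement below `tol`, or `max_rounds` reached) is an assumption.

```python
def refine_graph(propose, fitting_score, initial_graph, max_rounds=10, tol=1e-3):
    """Keep the best-scoring graph; stop once a round improves by less than tol."""
    best_graph = initial_graph
    best_score = fitting_score(initial_graph)
    history = [best_score]
    for _ in range(max_rounds):
        # The proposer sees the current best graph and its score as feedback.
        candidate = propose(best_graph, best_score)
        score = fitting_score(candidate)
        history.append(score)
        if score - best_score < tol:
            break
        best_graph, best_score = candidate, score
    return best_graph, best_score, history


# Toy stand-ins: reward edges from a known target, penalize spurious ones.
TARGET = {("diabetes", "ckd"), ("ckd", "heart_failure")}
QUEUE = [("diabetes", "ckd"), ("ckd", "heart_failure"), ("heart_failure", "diabetes")]


def toy_score(graph):
    return len(graph & TARGET) - 0.5 * len(graph - TARGET)


def toy_propose(graph, score):
    for edge in QUEUE:
        if edge not in graph:
            return graph | {edge}
    return graph


final_graph, final_score, scores = refine_graph(toy_propose, toy_score, frozenset())
```

In the toy run the first two proposals raise the score and are accepted, while the third adds a spurious edge, lowers the score, and ends the loop with the two-edge graph as the result.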


What's Next

II-KEA opens a promising paradigm for interpretable and interactive clinical AI. Here's where we're headed.

📚

Richer Domain Knowledge

The current system uses Wikipedia as a proof-of-concept knowledge base. Future work will integrate more specialized medical knowledge sources to improve fine-grained diagnosis prediction.

🩻

Broader Clinical Tasks

Beyond diagnosis prediction, we plan to extend II-KEA to support treatment planning, medication recommendation, and other clinician-facing tasks.

👥

Multi-Stakeholder Collaboration

Current interactions are limited to individual clinicians. Future iterations will enable collaborative decision-making involving multiple stakeholders for holistic, patient-centered care.


Authors

Stevens Institute of Technology  ·  Department of Computer Science

Explore the Code & Paper

II-KEA is open source. Dive into the implementation, datasets, and experiments.

📄 EMNLP Paper ⭐ GitHub Repo