MedSwin
Evidence Observatory
Clinical RAG · EMR + CPG · audit trail
MedSwin combines patient-specific EMR retrieval, clinical practice guidelines, biomedical reranking, specialist critique, and provenance tracking so each answer can be traced back to the evidence that shaped it.
System principles
The system tracks provenance, calibrated relevance, facet sufficiency, contradiction handling, agent reliability, benchmark performance, and release artifacts as distinct parts of the clinical evidence workflow.
Answer generation only follows an accepted EMR/CPG evidence bundle, rather than raw top-K context.
Reranker logits are converted to calibrated probabilities for threshold-based policy checks.
Contraindications, interactions, exclusions, and contradictions remain visible throughout selection.
Each claim keeps source metadata, score context, evidence grade, and facet role.
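The calibration principle above can be illustrated with a minimal sketch. The source does not say which calibration method is used; this example assumes a temperature-scaled sigmoid, and `TEMPERATURE`, `ACCEPT_THRESHOLD`, and both function names are hypothetical.

```python
import math

# Hypothetical values: in practice the temperature would be fit on held-out
# relevance labels and the threshold would come from the retrieval policy.
TEMPERATURE = 2.0
ACCEPT_THRESHOLD = 0.7


def calibrated_probability(logit: float, temperature: float = TEMPERATURE) -> float:
    """Map a raw reranker logit to a probability with a temperature-scaled sigmoid."""
    z = logit / temperature
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def passes_policy(logit: float, threshold: float = ACCEPT_THRESHOLD) -> bool:
    """Apply the policy threshold to the calibrated probability, not the raw logit."""
    return calibrated_probability(logit) >= threshold
```

Applying the threshold to a probability rather than a logit keeps the policy value interpretable when the reranker is retrained and its logit scale shifts.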
Architecture animation
A clinician query is decomposed into patient context and guideline evidence, reranked with biomedical relevance signals, checked for sufficiency, and returned with a grounded answer plus audit trail.
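The flow described above can be sketched as a small orchestration function. This is an illustrative outline only: `Passage`, `Result`, `run_pipeline`, and the injected callables are hypothetical names, and the real stages are not specified in the source beyond their order.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Passage:
    source: str  # "EMR" or "CPG"
    text: str
    score: float = 0.0  # relevance assigned by the reranker


@dataclass
class Result:
    answer: Optional[str]  # None when the evidence bundle was not accepted
    audit_trail: List[str] = field(default_factory=list)


def run_pipeline(
    query: str,
    retrieve_emr: Callable[[str], List[Passage]],
    retrieve_cpg: Callable[[str], List[Passage]],
    rerank: Callable[[str, List[Passage]], List[Passage]],
    is_sufficient: Callable[[List[Passage]], bool],
    generate: Callable[[str, List[Passage]], str],
) -> Result:
    """Retrieve, rerank, check sufficiency, and only then generate an answer."""
    trail = [f"query: {query}"]
    candidates = retrieve_emr(query) + retrieve_cpg(query)
    trail.append(f"retrieved: {len(candidates)} passages")
    ranked = rerank(query, candidates)
    trail.append(f"reranked: {len(ranked)} passages")
    if not is_sufficient(ranked):
        # No accepted bundle means no generation step at all.
        trail.append("sufficiency: rejected, no answer generated")
        return Result(answer=None, audit_trail=trail)
    trail.append("sufficiency: accepted")
    return Result(answer=generate(query, ranked), audit_trail=trail)
```

Passing the stages in as callables keeps the sketch independent of any particular retriever, reranker, or generator.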
Medical specialist
The specialist model is built from supervised biomedical instruction data, teacher-student distillation, and training-free merge operators that control destructive interference between updates.
SFT mixture
Hybrid SFT + KD
Interference-aware composition
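To make interference-aware composition concrete, here is a minimal sketch of one training-free merge. The source does not name the operators it uses; this example substitutes a trim-and-sign-election scheme in the style of TIES merging as a stand-in, and `merge_deltas` and `keep_fraction` are hypothetical names. Each delta is assumed to be a non-empty, equal-length list of parameter differences (fine-tuned minus base).

```python
from typing import List


def merge_deltas(deltas: List[List[float]], keep_fraction: float = 0.5) -> List[float]:
    """Merge parameter deltas: trim small entries, elect a sign, average agreeing values."""
    trimmed = []
    for delta in deltas:
        # Keep the largest-magnitude entries; ties at the cutoff are all kept.
        k = max(1, int(len(delta) * keep_fraction))
        cutoff = sorted((abs(v) for v in delta), reverse=True)[k - 1]
        trimmed.append([v if abs(v) >= cutoff else 0.0 for v in delta])

    merged = []
    for values in zip(*trimmed):
        # The elected sign is the direction with the larger total magnitude.
        sign = 1.0 if sum(values) >= 0 else -1.0
        # Entries that disagree with the elected sign are dropped, which is
        # what limits destructive interference between updates.
        agreeing = [v for v in values if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged
```

The merged delta would then be added back to the base weights. For the two deltas `[0.9, -0.1, 0.4, 0.0]` and `[0.8, 0.5, -0.6, 0.05]`, the third parameter has conflicting signs after trimming, so only the larger negative update survives there.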
Retrieval and sufficiency
Each retrieval stage is shown as a responsive diagram, and the sufficiency simulator moves toward its threshold as more evidence is retrieved.
Facet sufficiency simulator
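A facet sufficiency check of this kind might look like the sketch below. It assumes a noisy-OR combination of calibrated passage probabilities per facet, which the source does not specify; `REQUIRED_FACETS`, `FACET_THRESHOLD`, and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical facet names and threshold, chosen for illustration only.
REQUIRED_FACETS = ("guideline_concordance", "patient_applicability", "safety_risk")
FACET_THRESHOLD = 0.8


def facet_coverage(probabilities: List[float]) -> float:
    """Noisy-OR: probability that at least one passage covers the facet,
    treating passages as independent (an assumption of this sketch)."""
    miss = 1.0
    for p in probabilities:
        miss *= 1.0 - p
    return 1.0 - miss


def sufficiency_report(
    evidence: Dict[str, List[float]], threshold: float = FACET_THRESHOLD
) -> Dict[str, bool]:
    """Report, per required facet, whether coverage reaches the threshold."""
    return {
        facet: facet_coverage(evidence.get(facet, [])) >= threshold
        for facet in REQUIRED_FACETS
    }


def is_sufficient(evidence: Dict[str, List[float]]) -> bool:
    """The bundle is accepted only when every required facet is covered."""
    return all(sufficiency_report(evidence).values())
```

Under this model a single passage at 0.6 leaves a facet below the 0.8 threshold, while a second passage at 0.6 lifts coverage above it, which mirrors coverage rising toward the threshold as more evidence is retrieved.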
Multi-agent coordination
Agents explore different hypotheses, return claim-level ledgers, and are aggregated through reliability-weighted evidence selection.
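Reliability-weighted aggregation can be sketched as follows. The weighting rule here is an assumption: each claim's score is the sum of reliability times confidence over the agents reporting it, divided by the total reliability of all agents, so an agent that omits a claim counts as zero support. `aggregate_claims` and its argument shapes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_claims(
    ledgers: Dict[str, List[Tuple[str, float]]],
    reliability: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank claims by reliability-weighted support across agents.

    ledgers maps an agent name to (claim_id, confidence) pairs;
    reliability maps an agent name to a non-negative weight.
    """
    total = sum(reliability.get(agent, 0.0) for agent in ledgers)
    if total <= 0:
        return []
    support: Dict[str, float] = defaultdict(float)
    for agent, claims in ledgers.items():
        weight = reliability.get(agent, 0.0)
        for claim_id, confidence in claims:
            support[claim_id] += weight * confidence
    scores = {claim_id: value / total for claim_id, value in support.items()}
    # Highest weighted support first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Normalizing by total reliability rather than by the reliability of reporting agents means a claim raised by one low-reliability agent cannot outrank a claim that reliable agents agree on.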
Claim-level ledger
| Facet | Role | Polarity | Grade | Trace |
|---|---|---|---|---|
| Guideline concordance | Recommendation | supports | CPG | doc · version · section |
| Patient applicability | Lab / comorbidity | qualifies | EMR | encounter · timestamp |
| Safety risk | Contraindication | conflicts | Safety | severity · source |
| Uncertainty | Contradiction pair | preserved | Mixed | adjudication status |
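One way to represent a ledger row in code is sketched below. The field names follow the table columns; the `Polarity` enum, the `LedgerEntry` class, and the `needs_review` helper are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Polarity(Enum):
    SUPPORTS = "supports"
    QUALIFIES = "qualifies"
    CONFLICTS = "conflicts"
    PRESERVED = "preserved"  # unresolved contradiction kept visible


@dataclass(frozen=True)
class LedgerEntry:
    facet: str          # e.g. "Safety risk"
    role: str           # e.g. "Contraindication"
    polarity: Polarity
    grade: str          # "CPG", "EMR", "Safety", or "Mixed"
    trace: str          # provenance pointer, e.g. "doc · version · section"


def needs_review(entries: List[LedgerEntry]) -> List[LedgerEntry]:
    """Return entries that should stay visible during selection:
    conflicts and preserved contradictions."""
    flagged = {Polarity.CONFLICTS, Polarity.PRESERVED}
    return [entry for entry in entries if entry.polarity in flagged]
```

Making the entry immutable (`frozen=True`) is a design choice for this sketch: once a claim is logged, later stages can filter or rank it but cannot silently alter its provenance.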
Evaluation dashboard
Overlap metrics, semantic similarity, retrieval quality, latency, and audit completeness are kept in separate views so answer similarity is not mistaken for deployment readiness.
| Model | ROUGE-L | BERT-F1 | Token F1 | Uni Prec | Bi Prec |
|---|---|---|---|---|---|
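For readers unfamiliar with the overlap columns, the sketch below shows one common way to compute token F1 and clipped unigram or bigram precision. It assumes lowercased whitespace tokenization, which may differ from the project's evaluation harness; all function names are hypothetical.

```python
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercased whitespace tokenization (an assumption of this sketch)."""
    return text.lower().split()


def ngrams(toks: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))


def ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Clipped n-gram precision: matched candidate n-grams over all candidate n-grams."""
    cand = ngrams(tokens(candidate), n)
    ref = ngrams(tokens(reference), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Counter intersection clips each n-gram at its reference count.
    return sum((cand & ref).values()) / total


def token_f1(candidate: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    cand = Counter(tokens(candidate))
    ref = Counter(tokens(reference))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

With the candidate "start metformin today" and the reference "start metformin", unigram precision is 2/3, bigram precision is 1/2, and token F1 is 0.8.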
MSAS component families
Overlap and semantic metrics are useful signals, but unsupported claims, unsafe omissions, missing provenance, and unresolved contradictions are tracked separately because answer similarity alone is not enough for deployment safety.
Reproducibility and release scope
The release path covers data filters, teacher-label utilities, QLoRA scripts, merge specifications, retrieval policy packages, and evaluation harnesses.
Grounded clinical assistant
The final system keeps patient context, guideline passages, sufficiency decisions, contradiction handling, and provenance together so clinicians can inspect how an answer was formed.