Clinical RAG · EMR + CPG · audit trail

MedSwin grounds clinical answers in patient records and guideline evidence.

MedSwin combines patient-specific EMR retrieval, clinical practice guidelines, biomedical reranking, specialist critique, and provenance tracking so each answer can be traced back to the evidence that shaped it.

System principles

MedSwin separates retrieval, evidence quality, safety critique, and answer synthesis.

The system tracks provenance, calibrated relevance, facet sufficiency, contradiction handling, agent reliability, benchmark performance, and release artifacts as distinct parts of the clinical evidence workflow.

01. Evidence-first generation

Answer generation proceeds only from an accepted EMR/CPG evidence bundle, not from raw top-K context.

02. Calibrated inclusion

Reranker logits are converted to calibrated probabilities for threshold-based policy checks.
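
As a minimal sketch of this principle, the snippet below maps a reranker logit to a probability with Platt-style scaling (a sigmoid over an affine transform of the logit) and then applies a threshold check. The scaling parameters `a` and `b`, the 0.7 threshold, the function names, and the candidate identifiers are illustrative assumptions; the source does not state which calibration method or threshold MedSwin uses.

```python
import math

def calibrate(logit: float, a: float = 1.0, b: float = 0.0) -> float:
    """Map a raw reranker logit to a probability with Platt-style scaling.

    a and b are hypothetical parameters that would be fitted on held-out
    relevance labels; the defaults reduce to a plain sigmoid.
    """
    z = a * logit + b
    # Numerically stable sigmoid for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def passes_policy(logit: float, threshold: float = 0.7,
                  a: float = 1.0, b: float = 0.0) -> bool:
    """Return True when the calibrated probability meets the threshold."""
    return calibrate(logit, a, b) >= threshold

# Hypothetical candidates: identifier -> raw reranker logit.
candidates = {"cpg:section-a": 2.1, "emr:lab-note": 0.3, "cpg:section-b": -1.4}
accepted = [doc_id for doc_id, logit in candidates.items() if passes_policy(logit)]
```

Thresholding a probability rather than a raw logit keeps the inclusion policy meaningful if the reranker is retrained, provided the calibration parameters are refitted.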

03. Safety is a first-class facet

Contraindications, interactions, exclusions, and contradictions remain visible throughout selection.

04. Auditability over confidence

Each claim keeps source metadata, score context, evidence grade, and facet role.

Architecture animation

Trace a clinical query through the evidence chamber.

A clinician query is decomposed into patient context and guideline evidence, reranked with biomedical relevance signals, checked for sufficiency, and returned with a grounded answer plus audit trail.

Pipeline playback

Clinician query enters the chamber.

[Pipeline diagram, labels in order: Private Clinical System; Clinician Input / Case Query (Patient ID, Case query); Patient-Specific EMR (Patient Grounded EMR); CPG / Guidelines (Guideline Support); Query + EMR Context Orchestrator; Stage 1: Hybrid Biomedical Retrieval (EMR candidates + CPG candidates); Stage 2: Biomedical Reranker; Iterative Reasoning Loop; Grounded Clinical Answer]

Medical specialist

Training, distillation, and model merging are displayed as separate mechanisms.

The specialist model is built from supervised biomedical instruction data, teacher-student distillation, and training-free merge operators that control destructive interference between updates.

SFT mixture

Biomedical supervision sources

augmented

Hybrid SFT + KD

Teacher-student transfer

QLoRA
27B Teacher
hard labels · top-k soft logits · uncertainty transfer
7B Student
$L_t = \alpha\,\mathrm{CE} + (1 - \alpha)\,\tau^{2}\,\mathrm{KL}\big(p_T(\cdot \mid \tau) \,\|\, p_S(\cdot)\big)$
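
The hybrid loss above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the training code: it uses full distributions rather than top-k soft logits, it softens the student at the same temperature τ as the teacher (the usual distillation convention, which the formula above does not spell out), and the function names and default values of `alpha` and `tau` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def hybrid_kd_loss(student_logits, teacher_logits, label, alpha=0.5, tau=2.0):
    """alpha * CE(hard label) + (1 - alpha) * tau^2 * KL(teacher || student)."""
    # Hard-label cross-entropy on the student's unscaled distribution.
    ce = -math.log(softmax(student_logits)[label])
    # Soft-label term: both distributions softened at tau (assumption).
    p_t = softmax(teacher_logits, tau)
    p_s = softmax(student_logits, tau)
    kl = sum(t * (math.log(t) - math.log(s)) for t, s in zip(p_t, p_s) if t > 0)
    return alpha * ce + (1 - alpha) * tau ** 2 * kl
```

The τ² factor offsets the 1/τ² shrinkage of soft-target gradients, so the balance set by α stays comparable across temperatures.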

Interference-aware composition

Merge operators reduce destructive update conflict.

training-free
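
The source does not name the merge operators. As one illustration of how a training-free merge can limit destructive interference, the sketch below uses a sign-election rule in the spirit of TIES-merging: for each parameter it keeps only the updates that agree with the dominant direction and averages those. This is a swapped-in technique for illustration, not a statement of MedSwin's actual operator, and the function name and toy vectors are hypothetical.

```python
def sign_consensus_merge(base, deltas):
    """Merge several fine-tuning deltas onto base weights without training.

    base: list of floats (flattened base parameters).
    deltas: list of lists, each holding (fine-tuned - base) per parameter.
    """
    merged = []
    for i, w in enumerate(base):
        updates = [d[i] for d in deltas]
        # Elect the dominant direction by total signed mass.
        total = sum(updates)
        if total == 0:
            merged.append(w)  # updates cancel exactly: keep the base weight
            continue
        sign = 1.0 if total > 0 else -1.0
        # Keep only updates that agree with the elected sign, then average.
        agreeing = [u for u in updates if u * sign > 0]
        merged.append(w + sum(agreeing) / len(agreeing))
    return merged

# Toy example: three deltas over three parameters.
base = [0.0, 1.0, 2.0]
deltas = [[0.2, -0.4, 0.0], [0.4, 0.1, 0.0], [-0.1, -0.2, 0.0]]
merged = sign_consensus_merge(base, deltas)
```

A plain average would let the opposing update on each parameter pull the result toward zero; discarding the minority sign is one way to avoid that cancellation.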

Retrieval and sufficiency

Evidence is selected by clinical utility, not by raw top-K truncation.

Each retrieval stage uses a responsive diagram, and the sufficiency simulator animates toward threshold when more evidence is retrieved.

Facet sufficiency simulator

Build an evidence bundle

Critical facets are below acceptance threshold.
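
To make the idea of facet sufficiency concrete, here is a minimal sketch that builds a bundle facet by facet instead of truncating at top-K. Everything specific in it is an assumption for illustration: the `Evidence` type, the facet names, the noisy-OR coverage rule, and the 0.8 threshold are not taken from the source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    doc_id: str
    facet: str          # hypothetical facet label, e.g. "guideline"
    probability: float  # calibrated relevance in [0, 1]

def facet_coverage(bundle):
    """Noisy-OR coverage per facet: 1 - prod(1 - p) over its evidence."""
    miss = {}
    for ev in bundle:
        miss[ev.facet] = miss.get(ev.facet, 1.0) * (1.0 - ev.probability)
    return {facet: 1.0 - m for facet, m in miss.items()}

def insufficient_facets(bundle, critical_facets, threshold=0.8):
    """Return the critical facets whose coverage is below the threshold."""
    coverage = facet_coverage(bundle)
    return [f for f in critical_facets if coverage.get(f, 0.0) < threshold]

def build_bundle(candidates, critical_facets, threshold=0.8):
    """Add candidates by descending probability, only for facets still lacking."""
    bundle = []
    for ev in sorted(candidates, key=lambda e: e.probability, reverse=True):
        if ev.facet in insufficient_facets(bundle, critical_facets, threshold):
            bundle.append(ev)
    return bundle, insufficient_facets(bundle, critical_facets, threshold)

candidates = [
    Evidence("g1", "guideline", 0.90),
    Evidence("g2", "guideline", 0.85),
    Evidence("p1", "patient", 0.70),
    Evidence("p2", "patient", 0.60),
    Evidence("s1", "safety", 0.40),
]
bundle, missing = build_bundle(candidates, ["guideline", "patient", "safety"])
```

In this toy run the second guideline passage is skipped because that facet is already covered, while the safety facet stays below threshold, so the bundle would not yet be accepted.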

Multi-agent coordination

The MAC layer behaves like a specialist dive team.

Agents explore different hypotheses, return claim-level ledgers, and are aggregated through reliability-weighted evidence selection.
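
One way to picture reliability-weighted aggregation is the sketch below. The ledger layout, the agent names, and the reliability weights are hypothetical; the source does not specify how reliability is estimated or combined. Scores are kept per claim and polarity so that opposing claims are not netted against each other.

```python
from collections import defaultdict

def aggregate_claims(ledgers, reliability):
    """Combine per-agent claim ledgers into reliability-weighted support.

    ledgers: {agent_name: [(claim_id, polarity, confidence), ...]}
    reliability: {agent_name: weight in [0, 1]} (hypothetical weights)
    Returns {(claim_id, polarity): weighted score}.
    """
    total_weight = sum(reliability[agent] for agent in ledgers)
    if total_weight == 0:
        return {}
    weighted = defaultdict(float)
    for agent, claims in ledgers.items():
        for claim_id, polarity, confidence in claims:
            weighted[(claim_id, polarity)] += reliability[agent] * confidence
    return {key: value / total_weight for key, value in weighted.items()}

ledgers = {
    "cardiology": [("c1", "supports", 0.9)],
    "pharmacology": [("c1", "conflicts", 0.8), ("c2", "supports", 0.5)],
}
reliability = {"cardiology": 0.6, "pharmacology": 0.4}
scores = aggregate_claims(ledgers, reliability)
```

Here claim `c1` ends up with both a supporting and a conflicting score, which leaves the disagreement visible to a later critic rather than collapsing it into one number.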

Claim-level ledger

Every claim keeps its source role.

audit artifact
Facet                 | Role               | Polarity  | Grade  | Trace
Guideline concordance | Recommendation     | supports  | CPG    | doc · version · section
Patient applicability | Lab / comorbidity  | qualifies | EMR    | encounter · timestamp
Safety risk           | Contraindication   | conflicts | Safety | severity · source
Uncertainty           | Contradiction pair | preserved | Mixed  | adjudication status

Contradictions are not averaged away. High-grade conflicts are preserved until the critic or final synthesiser explicitly adjudicates them.
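
The ledger columns map naturally onto a small record type. The sketch below is an assumption about how such a ledger could be represented, not the project's schema: the `claim_id` and `adjudicated` fields and the `unresolved_conflicts` helper are hypothetical additions used to show how a conflict can stay flagged until it is adjudicated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LedgerEntry:
    claim_id: str              # hypothetical key linking entries to one claim
    facet: str                 # e.g. "Guideline concordance", "Safety risk"
    role: str                  # e.g. "Recommendation", "Contraindication"
    polarity: str              # "supports", "qualifies", "conflicts", "preserved"
    grade: str                 # e.g. "CPG", "EMR", "Safety", "Mixed"
    trace: str                 # provenance pointer
    adjudicated: bool = False  # hypothetical flag set by the critic

def unresolved_conflicts(entries):
    """Return claim ids with both supporting and conflicting entries
    where at least one entry has not been adjudicated."""
    by_claim = {}
    for entry in entries:
        by_claim.setdefault(entry.claim_id, []).append(entry)
    unresolved = []
    for claim_id, group in by_claim.items():
        polarities = {e.polarity for e in group}
        contested = {"supports", "conflicts"} <= polarities
        if contested and not all(e.adjudicated for e in group):
            unresolved.append(claim_id)
    return unresolved

entries = [
    LedgerEntry("c1", "Guideline concordance", "Recommendation",
                "supports", "CPG", "doc · version · section"),
    LedgerEntry("c1", "Safety risk", "Contraindication",
                "conflicts", "Safety", "severity · source"),
    LedgerEntry("c2", "Patient applicability", "Lab / comorbidity",
                "qualifies", "EMR", "encounter · timestamp"),
]
open_conflicts = unresolved_conflicts(entries)
```

Keeping both entries for `c1` in the ledger, instead of merging them, is what lets a reviewer see which source supported the claim and which one contradicted it.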

Evaluation dashboard

QA, reranking, and audit metrics are inspected separately.

Overlap metrics, semantic similarity, retrieval quality, latency, and audit completeness are kept in separate views so answer similarity is not mistaken for deployment readiness.

Model | ROUGE-L | BERT-F1 | Token F1 | Unigram precision | Bigram precision
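
For readers unfamiliar with the overlap columns, the sketch below shows one common definition of token-level F1 and unigram precision. It assumes lowercased whitespace tokenisation and multiset overlap, as in SQuAD-style scoring; the source does not say which tokeniser or normalisation the dashboard uses, and the function names are illustrative.

```python
from collections import Counter

def _overlap(pred_tokens, ref_tokens):
    """Count tokens shared by both sequences, respecting multiplicity."""
    return sum((Counter(pred_tokens) & Counter(ref_tokens)).values())

def unigram_precision(prediction: str, reference: str) -> float:
    """Fraction of predicted tokens that also appear in the reference."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred:
        return 0.0
    return _overlap(pred, ref) / len(pred)

def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)  # both empty counts as a match
    overlap = _overlap(pred, ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A high score on either metric only says the wording overlaps with the reference; it says nothing about whether a claim is supported by the retrieved evidence.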

MSAS component families

Auditability is separated from answer similarity.

Clinical interpretation boundary

Overlap and semantic metrics are useful signals, but unsupported claims, unsafe omissions, missing provenance, and unresolved contradictions are tracked separately because answer similarity alone is not enough for deployment safety.

Reproducibility and release scope

The final build presents MedSwin as a reconstructable pipeline.

The release path covers data filters, teacher-label utilities, QLoRA scripts, merge specifications, retrieval policy packages, and evaluation harnesses.

Grounded clinical assistant

MedSwin returns evidence-linked answers rather than unsupported narratives.

The final system keeps patient context, guideline passages, sufficiency decisions, contradiction handling, and provenance together so clinicians can inspect how an answer was formed.
