---
tags:
- chest-xray
- radiology
- visual-question-answering
- differential-vqa
- mimic-cxr
license: apache-2.0
---

# LAPVQA — Differential VQA (Captioning-Pretrained Encoder)

Part of the [LAPVQA collection](https://huggingface.co/collections/dmusingu/lapvqa).

## Description

DiffVQA head trained on the frozen **LAPVQA captioning-pretrained encoder** ([`lapvqa-pretrain-captioning`](https://huggingface.co/dmusingu/lapvqa-pretrain-captioning)). Checkpoint is a plain `DiffVQAHead` state dict (vis_dim=1024).

## Results (test set)

| BLEU-4 | ROUGE-2 | RadGraph-s | BERTScore F1 |
|---|---|---|---|
| 0.468 | 0.562 | 0.303 | 0.938 |

## Loading

```python
import torch
from lapvqa.diffvqa.model import DiffVQAHead

ckpt = torch.load("pretrain-captioning_best.pt", map_location="cpu")
head = DiffVQAHead(vis_dim=1024)
head.load_state_dict(ckpt)
head.eval()
# pair with encoder_final.pt from lapvqa-pretrain-captioning
```
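
The loading snippet above relies on the checkpoint keys matching the parameter names of `DiffVQAHead` exactly; with PyTorch's default `strict=True`, `load_state_dict` raises a `RuntimeError` otherwise. A minimal sketch of a pre-flight key comparison follows. The helper name `compare_state_dict_keys` and the key names in the example are hypothetical and not part of the `lapvqa` package; plain dicts stand in for real state dicts so the sketch runs without the model code.

```python
def compare_state_dict_keys(expected_keys, loaded_keys):
    """Return (missing, unexpected) parameter names as sorted lists.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names in the checkpoint that the model does not define
    """
    expected, loaded = set(expected_keys), set(loaded_keys)
    missing = sorted(expected - loaded)
    unexpected = sorted(loaded - expected)
    return missing, unexpected


# Stand-in dicts with made-up key names (assumption, for illustration only).
expected = {"proj.weight": None, "proj.bias": None}
loaded = {"proj.weight": None, "module.proj.bias": None}

missing, unexpected = compare_state_dict_keys(expected, loaded)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the real objects, the same helper could be called as `compare_state_dict_keys(head.state_dict(), ckpt)` before `head.load_state_dict(ckpt)`, since iterating either mapping yields parameter names.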