# Dataset Sources

This document describes the datasets and preprocessing sources used in this project.

### MIMIC-CXR
- **Images & Reports**: [MIMIC-CXR-JPG v2.1.0](https://physionet.org/content/mimic-cxr-jpg/2.1.0/)

### OpenI
- **Images**: [Chest X-rays Indiana University (Kaggle)](https://www.kaggle.com/datasets/raddar/chest-xrays-indiana-university)
- **Preprocessing**: [CARZero (CVPR 2024)](https://github.com/laihaoran/CARZero) (minor path modifications in CSV files)

### PadChest
- **Images**: [BIMCV PadChest](https://bimcv.cipf.es/bimcv-projects/padchest/)
- **Preprocessing**: [CARZero (CVPR 2024)](https://github.com/laihaoran/CARZero) (minor path modifications in CSV files)

### ChestXray14
- **Images**: [NIH ChestXray](https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/37178474737)
- **Preprocessing**: [CARZero (CVPR 2024)](https://github.com/laihaoran/CARZero) (minor path modifications in CSV files)

### CheXpert
- **Images**: [Stanford CheXpert](https://stanfordmlgroup.github.io/competitions/chexpert/)
- **Preprocessing**: [CARZero (CVPR 2024)](https://github.com/laihaoran/CARZero) (minor path modifications in CSV files)

### ChestXDet10
- **Images**: [Deepwise AILab ChestX-Det10](https://github.com/Deepwise-AILab/ChestX-Det10-Dataset)
- **Preprocessing**: [CARZero (CVPR 2024)](https://github.com/laihaoran/CARZero) (minor path modifications in CSV files)

### SIIM
- **Images**: [SIIM-ACR Pneumothorax Segmentation (Kaggle)](https://www.kaggle.com/datasets/jesperdramsch/siim-acr-pneumothorax-segmentation-data)
- **Preprocessing**: [MGCA (NeurIPS 2022)](https://github.com/HKU-MedAI/MGCA/blob/main/mgca/preprocess/siim.py)

### RSNA
- **Images**: [RSNA Pneumonia Detection Challenge 2018](https://www.rsna.org/artificial-intelligence/ai-image-challenge/rsna-pneumonia-detection-challenge-2018)
- **Preprocessing**: [MedKLIP (ICCV 2023)](https://github.com/MediaBrain-SJTU/MedKLIP/blob/main/Sample_Zero-Shot_Grounding_RSNA/data_sample/test.csv) (only file paths in CSV files were modified)

### MS-CXR
- **Images**: [MIMIC-CXR-JPG v2.1.0](https://physionet.org/content/mimic-cxr-jpg/2.1.0/)
- **Dataset**: [MS-CXR v0.1](https://physionet.org/content/ms-cxr/0.1/)
- **Test Split**: [MedRPG (MICCAI 2023)](https://github.com/eraserNut/MedRPG/tree/master/data/MS_CXR)
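
Several entries above say that only the file paths in the upstream CSV files were changed. This document does not show how that was done, so the following is only a minimal sketch of one way to re-root image paths in such a CSV. The column name `Path`, the roots `/old/root` and `/new/root`, and the helper names `rewrite_paths` and `rewrite_csv_text` are hypothetical and not taken from CARZero or MedKLIP. Adjust them to match the actual files.

```python
import csv
import io
from pathlib import PurePosixPath


def rewrite_paths(rows, column, old_root, new_root):
    """Return copies of `rows` with `old_root` swapped for `new_root` in `column`.

    Paths that do not live under `old_root` are left unchanged.
    """
    old = PurePosixPath(old_root)
    new = PurePosixPath(new_root)
    out = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not mutated
        path = PurePosixPath(row[column])
        try:
            row[column] = str(new / path.relative_to(old))
        except ValueError:
            pass  # path is not under old_root; keep it as-is
        out.append(row)
    return out


def rewrite_csv_text(text, column, old_root, new_root):
    """Apply `rewrite_paths` to CSV text and return the rewritten CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames  # reads the header row
    rows = rewrite_paths(reader, column, old_root, new_root)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical sample; real CSVs will have different columns and roots.
    sample = "Path,label\n/old/root/images/a.png,1\n/other/b.png,0\n"
    print(rewrite_csv_text(sample, "Path", "/old/root", "/new/root"))
```

For files on disk, the same function can be applied to the text read from the CSV and the result written to a new file, which keeps the downloaded original untouched for comparison.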