---
license: cc-by-nc-4.0
tags:
- mirror-detection
- image-segmentation
- computer-vision
- pytorch
---

# PMDNet: Progressive Mirror Detection

Pretrained weights for **PMDNet**, the model introduced in the CVPR 2020 paper [*Progressive Mirror Detection*](https://jiaying.link/cvpr2020-pgd/).

## Model Description

PMDNet progressively detects mirror surfaces by leveraging multi-scale contrast cues and relational context.

**Architecture overview:**

- **Backbone**: ResNeXt-101 (32×4d), producing feature maps at four scales.
- **Contrast Module** (`Contrast_Module_Deep`): at each scale, dilated convolutions capture local–context differences, then four stacked `Contrast_Block_Deep` units compute pairwise local–context subtractions at two dilation rates. Outputs are aggregated with CBAM (channel + spatial attention).
- **Relation Attention** (`Relation_Attention` / `RAttention`): criss-cross attention over rows, columns, and both diagonals, enabling long-range relational reasoning without a full self-attention map.
- **Decoder**: four transposed-convolution upsampling stages with CBAM refinement produce intermediate saliency predictions (`f4 → f1`), each gated by the previous scale's prediction for progressive focus.
- **Edge Branch**: extracts edge features from `layer1` fused with high-level `cbam_4` context, producing an explicit edge map.
- **Refinement**: a single 1×1 conv fuses the original image, all four scale predictions, and the edge map into the final mirror mask.

**Input:** RGB image, resized to 416×416.

**Output (eval):** `(f4, f3, f2, f1, edge, final)`, sigmoid-activated predictions at input resolution. Optional CRF post-processing is applied to the final prediction.
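Because the predictions come out at the 416×416 network resolution with values in [0, 1], a downstream user will usually want a binary mask at the original image size. The following is a minimal sketch of that step, not part of the official release: the helper name `to_binary_mask`, the 0.5 threshold, and the simple index-based resize are all assumptions made for illustration, and the CRF step is not included.

```python
import numpy as np


def to_binary_mask(pred, out_hw, threshold=0.5):
    """Resize a [0, 1] prediction map to out_hw and binarize it.

    Hypothetical helper: the threshold and the resize method are
    illustrative choices, not values taken from the PMDNet code.
    """
    pred = np.asarray(pred, dtype=np.float32)
    in_h, in_w = pred.shape
    out_h, out_w = out_hw
    # Map every output pixel back to a source pixel (floor index mapping,
    # a nearest-neighbour-style resize that needs no extra dependencies).
    rows = np.minimum(np.arange(out_h) * in_h // out_h, in_h - 1)
    cols = np.minimum(np.arange(out_w) * in_w // out_w, in_w - 1)
    resized = pred[rows[:, None], cols[None, :]]
    # Pixels at or above the threshold are labelled as mirror (1).
    return (resized >= threshold).astype(np.uint8)


# Synthetic stand-in for a 416x416 prediction map.
pred = np.zeros((416, 416), dtype=np.float32)
pred[100:300, 50:200] = 0.9
mask = to_binary_mask(pred, (480, 640))  # (height, width) of the original image
```

With the real model, `pred` would be the final prediction converted to a 2-D array, for example `final[0, 0].cpu().numpy()` if the output has shape `(1, 1, 416, 416)`; that shape is an assumption and should be checked against the official code. A smoother interpolation (bilinear, for instance) may give cleaner mask boundaries than this index-based resize.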
## Weights

| File | Size | Description |
|------|------|-------------|
| `pmd.pth` | ~414 MB | Full model weights (ResNeXt-101 backbone + decoder) |

## Usage

```python
import torch
from torchvision import transforms
from PIL import Image
from model.pmd import PMD  # from the official code release

model = PMD()
model.load_state_dict(torch.load("pmd.pth", map_location="cpu"))
model.eval()

transform = transforms.Compose([
    transforms.Resize((416, 416)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open("your_image.jpg").convert("RGB")
x = transform(img).unsqueeze(0)

with torch.no_grad():
    f4, f3, f2, f1, edge, final = model(x)

# `final` is the mirror mask prediction (values in [0, 1])
```

Full inference script with CRF post-processing: see [`code_minimal/infer.py`](https://jiaying.link/cvpr2020-pgd/).

## Dataset

Trained on the [PMD dataset](https://huggingface.co/datasets/garrying/PMD) (5,095 training images with mirror masks and edge maps).

## Performance

| Method | F_β | MAE |
|-----------|-------|-------|
| EGNet | 0.672 | 0.087 |
| MirrorNet | 0.748 | 0.061 |
| **PMDNet (ours)** | **0.790** | **0.032** |

Evaluated on the PMD test split (571 images).

## License

CC BY-NC 4.0 (non-commercial use only).

## Citation

```bibtex
@INPROCEEDINGS{PMD:2020,
  Author = {Jiaying Lin and Guodong Wang and Rynson W.H. Lau},
  Title = {Progressive Mirror Detection},
  Booktitle = {Proc. CVPR},
  Year = {2020}
}
```

## Contact

csjylin@gmail.com