| --- |
| license: cc-by-nc-4.0 |
| tags: |
| - mirror-detection |
| - image-segmentation |
| - computer-vision |
| - pytorch |
| --- |
| |
| # PMDNet — Progressive Mirror Detection |
|
|
| Pretrained weights for **PMDNet**, the model introduced in the CVPR 2020 paper [*Progressive Mirror Detection*](https://jiaying.link/cvpr2020-pgd/). |
|
|
| ## Model Description |
|
|
| PMDNet detects mirror surfaces coarse-to-fine: it combines multi-scale contrast cues with relational context, and each decoder stage refines the prediction made at the previous, coarser scale. |
|
|
| **Architecture overview:** |
|
|
| - **Backbone** — ResNeXt-101 (32×4d), producing feature maps at four scales. |
| - **Contrast Module** (`Contrast_Module_Deep`) — at each scale, dilated convolutions capture local–context differences, then four stacked `Contrast_Block_Deep` units compute pairwise local–context subtractions at two dilation rates. Outputs are aggregated with CBAM (channel + spatial attention). |
| - **Relation Attention** (`Relation_Attention` / `RAttention`) — criss-cross attention over rows, columns, and both diagonals, enabling long-range relational reasoning without a full self-attention map. |
| - **Decoder** — four transposed-convolution upsampling stages with CBAM refinement produce intermediate saliency predictions (`f4 → f1`), each gated by the previous scale's prediction for progressive focus. |
| - **Edge Branch** — extracts edge features from `layer1` fused with high-level `cbam_4` context, producing an explicit edge map. |
| - **Refinement** — a single 1×1 conv fuses the original image, all four scale predictions, and the edge map into the final mirror mask. |
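The local-versus-context subtraction behind the Contrast Module can be illustrated with a short PyTorch sketch. This is a simplified, hypothetical unit and not the released `Contrast_Block_Deep`: the class name `ContrastSketch`, the single dilation rate, and the channel count are assumptions chosen for illustration only.

```python
import torch
import torch.nn as nn


class ContrastSketch(nn.Module):
    """Hypothetical, simplified local-minus-context contrast unit."""

    def __init__(self, channels: int, dilation: int = 4):
        super().__init__()
        # Local branch: ordinary 3x3 convolution (small receptive field).
        self.local = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Context branch: dilated 3x3 convolution (larger receptive field).
        # Setting padding == dilation keeps the spatial size unchanged.
        self.context = nn.Conv2d(
            channels, channels, kernel_size=3, padding=dilation, dilation=dilation
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Contrast feature = local response minus context response.
        return self.act(self.local(x) - self.context(x))


feat = torch.randn(1, 64, 52, 52)  # dummy feature map (assumed shape)
out = ContrastSketch(64)(feat)
print(out.shape)  # same shape as the input feature map
```

In the actual model, several such subtractions at different dilation rates are stacked and then aggregated with CBAM, as listed above; the sketch only shows the core idea of one subtraction.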
|
|
| **Input:** RGB image, resized to 416×416. |
| **Output (eval):** `(f4, f3, f2, f1, edge, final)` — sigmoid-activated predictions at input resolution. |
| CRF post-processing can optionally be applied to the final prediction. |
|
|
| ## Weights |
|
|
| | File | Size | Description | |
| |------|------|-------------| |
| | `pmd.pth` | ~414 MB | Full model weights (ResNeXt-101 backbone + decoder) | |
|
|
| ## Usage |
|
|
| ```python |
| import torch |
| from torchvision import transforms |
| from PIL import Image |
| from model.pmd import PMD  # from the official code release |
| |
| # Build the network and load the pretrained weights on CPU. |
| model = PMD() |
| model.load_state_dict(torch.load("pmd.pth", map_location="cpu")) |
| model.eval()  # inference mode: the model returns six prediction maps |
| |
| # Resize to the 416x416 input size and apply ImageNet normalization. |
| transform = transforms.Compose([ |
|     transforms.Resize((416, 416)), |
|     transforms.ToTensor(), |
|     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), |
| ]) |
| |
| img = Image.open("your_image.jpg").convert("RGB") |
| x = transform(img).unsqueeze(0)  # add a batch dimension |
| |
| with torch.no_grad(): |
|     f4, f3, f2, f1, edge, final = model(x) |
|     # `final` is the mirror mask prediction (values in [0, 1]) |
| ``` |
|
|
| For a full inference script with CRF post-processing, see [`code_minimal/infer.py`](https://jiaying.link/cvpr2020-pgd/). |
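To turn the `final` probability map into a mask at the original image size, a post-processing step along the following lines may help. It is a minimal sketch without CRF: the helper name `to_binary_mask`, the 0.5 threshold, and the assumed `(1, 1, h, w)` shape of `final` are illustrative assumptions, not part of the official release.

```python
import torch
import torch.nn.functional as F


def to_binary_mask(pred: torch.Tensor, size: tuple, threshold: float = 0.5) -> torch.Tensor:
    """Resize a (1, 1, h, w) probability map to size=(H, W) and binarize it.

    Hypothetical helper; the threshold is an assumed default.
    """
    # Bilinear upsampling back to the original resolution.
    pred = F.interpolate(pred, size=size, mode="bilinear", align_corners=False)
    # Drop batch and channel dims, threshold, and scale to {0, 255}.
    return (pred.squeeze(0).squeeze(0) > threshold).to(torch.uint8) * 255


# Dummy prediction standing in for `final` from the usage snippet.
dummy = torch.rand(1, 1, 416, 416)
mask = to_binary_mask(dummy, size=(480, 640))
print(mask.shape, mask.dtype)
```

With the variables from the usage snippet, the call would be `to_binary_mask(final, size=img.size[::-1])`, because PIL reports image size as `(width, height)` while `F.interpolate` expects `(height, width)`.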
|
|
| ## Dataset |
|
|
| Trained on the [PMD dataset](https://huggingface.co/datasets/garrying/PMD) (5,095 training images with mirror masks and edge maps). |
|
|
| ## Performance |
|
|
| | Method | F_β | MAE | |
| |-----------|-------|-------| |
| | EGNet | 0.672 | 0.087 | |
| | MirrorNet | 0.748 | 0.061 | |
| | **PMDNet (ours)** | **0.790** | **0.032** | |
| |
| Evaluated on the PMD test split (571 images). |
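For readers who want to compute comparable numbers on their own predictions, the two metrics can be sketched in NumPy as below. The weighting `beta2 = 0.3` and the fixed 0.5 threshold are assumptions borrowed from common salient-object-detection practice; the evaluation protocol used for the table above may differ (for example, it may sweep thresholds), so treat the output as indicative only.

```python
import numpy as np


def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean absolute error between a [0, 1] prediction and a binary mask."""
    return float(np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64))))


def f_beta(pred: np.ndarray, gt: np.ndarray, threshold: float = 0.5,
           beta2: float = 0.3, eps: float = 1e-8) -> float:
    """F-measure at a single threshold (beta2 and threshold are assumed values)."""
    binary = pred >= threshold
    gt_bool = gt > 0.5
    tp = np.logical_and(binary, gt_bool).sum()
    precision = tp / (binary.sum() + eps)
    recall = tp / (gt_bool.sum() + eps)
    return float((1 + beta2) * precision * recall / (beta2 * precision + recall + eps))


# Toy 2x2 example: one true mirror pixel, two pixels predicted as mirror.
pred = np.array([[0.9, 0.2], [0.8, 0.1]])
gt = np.array([[1, 0], [0, 0]])
print(mae(pred, gt), f_beta(pred, gt))
```

Lower MAE and higher F_β are better, which matches the direction of the comparison in the table.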
| |
| ## License |
| |
| CC BY-NC 4.0 — non-commercial use only. |
| |
| ## Citation |
| |
| ```bibtex |
| @INPROCEEDINGS{PMD:2020, |
| Author = {Jiaying Lin and Guodong Wang and Rynson W.H. Lau}, |
| Title = {Progressive Mirror Detection}, |
| Booktitle = {Proc. CVPR}, |
| Year = {2020} |
| } |
| ``` |
| |
| ## Contact |
| |
| csjylin@gmail.com |
| |