---
license: cc-by-nc-4.0
task_categories:
- image-segmentation
tags:
- glass-surface-detection
- semantic-segmentation
- scene-understanding
- pytorch
pretty_name: GlassSemNet (Glass Semantic Network)
---
# GlassSemNet — Glass Semantic Network
Pre-trained weights for **GlassSemNet**, introduced in:
> **Exploiting Semantic Relations for Glass Surface Detection**
> Jiaying Lin, Yuen-Hei Yeung, Rynson W. H. Lau
> NeurIPS 2022
> [Paper](https://openreview.net/forum?id=WrIrYMCZgbb) · [Project Page](https://jiaying.link/neurips2022-gsds/) · [Dataset (GSD-S)](https://huggingface.co/datasets/garrying/GSD-S)
## Model Summary
GlassSemNet detects glass surfaces by exploiting semantic relations between a glass region and its surrounding scene context. It pairs two backbones with fusion and decoding modules:
- **Spatial backbone (SegFormer)**: extracts multi-scale spatial features.
- **Semantic backbone (ResNet-50 + DeepLabV3+)**: encodes 43-class semantic scene features into compact per-class encodings.
- **Semantic-Aware Attention (SAA)**: fuses spatial and semantic features at three scales using the semantic encodings as guidance.
- **Cross-modal Context Aggregation (CCA)**: aggregates cross-scale context at the deepest level.
- **UPerNet decoder**: combines the fused multi-scale features into the final glass surface prediction.
## Files
| File | Description |
|------|-------------|
| `GlassSemNet.pth` | Best checkpoint (917 MB), saved as a raw `state_dict` |
## Loading the Weights
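```python
import torch
from model.GlassSemNet import GlassSemNet  # from the code release

model = GlassSemNet()
# The checkpoint is a raw state_dict, so it loads directly into the model.
# Adjust the path if the file was downloaded elsewhere (e.g. ./weights/GlassSemNet.pth).
state_dict = torch.load("GlassSemNet.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # inference mode: disables dropout, uses running batch-norm statistics
```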
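If `load_state_dict` reports mismatched keys, one common cause in PyTorch checkpoints is a `module.` prefix left behind by `torch.nn.DataParallel`. Whether this checkpoint carries such a prefix is not stated here, so the helpers below are only a sketch for diagnosing the problem. `strip_prefix` and `diff_keys` are hypothetical names, not part of the code release.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


def diff_keys(expected_keys, state_dict):
    """Return (missing, unexpected): keys the model expects but the checkpoint
    lacks, and keys the checkpoint has that the model does not expect."""
    expected = set(expected_keys)
    found = set(state_dict)
    return sorted(expected - found), sorted(found - expected)
```

A possible use is `state_dict = strip_prefix(state_dict)` followed by `missing, unexpected = diff_keys(model.state_dict().keys(), state_dict)`; both lists should be empty before calling `model.load_state_dict(state_dict)`.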
Download the checkpoint:
```bash
huggingface-cli download garrying/GlassSemNet GlassSemNet.pth --local-dir ./weights
```
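The same download can be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch: the repository ID and filename are taken from the command above, while `fetch_checkpoint` and its default `local_dir` are illustrative names chosen here.

```python
from pathlib import Path

REPO_ID = "garrying/GlassSemNet"  # repository used in the CLI command
FILENAME = "GlassSemNet.pth"      # checkpoint file listed in this card


def fetch_checkpoint(local_dir="./weights"):
    """Return the local checkpoint path, downloading the file only if it is absent."""
    target = Path(local_dir) / FILENAME
    if target.is_file():
        return target
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir))
```

Calling `fetch_checkpoint()` returns a path that can be passed to `torch.load` or to the `-c` option of `predict.py`.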
## Inference
```bash
python predict.py -c GlassSemNet.pth -i /path/to/images/ -o /path/to/output/
```
Images are resized to **384 × 384** internally. Predictions are refined with a conditional random field (CRF) and then thresholded to produce binary glass surface masks.
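The thresholding step can be illustrated with NumPy. This sketch does not reproduce `predict.py`: it skips the CRF refinement, the 0.5 threshold is an assumption, and `to_binary_mask` and `resize_nearest` are hypothetical helper names.

```python
import numpy as np


def to_binary_mask(prob_map, threshold=0.5):
    """Turn a probability map with values in [0, 1] into a 0/255 uint8 mask.
    The default threshold of 0.5 is an assumption, not the released setting."""
    prob_map = np.asarray(prob_map, dtype=np.float32)
    return (prob_map >= threshold).astype(np.uint8) * 255


def resize_nearest(mask, height, width):
    """Nearest-neighbour resize of a 2-D array, e.g. from 384 x 384 back to the input size."""
    rows = np.arange(height) * mask.shape[0] // height
    cols = np.arange(width) * mask.shape[1] // width
    return mask[rows[:, None], cols]
```

Nearest-neighbour resizing keeps the mask strictly binary, which interpolating resizers do not guarantee.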
## Training Dataset
This model was trained and evaluated on **GSD-S**, the first glass surface detection dataset with semantic annotations:
- 4,519 images (3,511 train / 1,008 test) with binary glass masks, instance segmentation maps, and 43-class semantic labels
- Available at [garrying/GSD-S](https://huggingface.co/datasets/garrying/GSD-S)
## Citation
```bibtex
@article{neurips2022:gsds2022,
author = {Lin, Jiaying and Yeung, Yuen-Hei and Lau, Rynson W.H.},
title = {Exploiting Semantic Relations for Glass Surface Detection},
journal = {NeurIPS},
year = {2022},
}
```
## License
Non-commercial use only — [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).