
	---
	license: cc-by-nc-4.0
	task_categories:
	- image-segmentation
	tags:
	- glass-surface-detection
	- semantic-segmentation
	- scene-understanding
	- pytorch
	pretty_name: GlassSemNet (Glass Semantic Network)
	---

	# GlassSemNet — Glass Semantic Network

	Pre-trained weights for GlassSemNet, introduced in:

	> Exploiting Semantic Relations for Glass Surface Detection
	> Jiaying Lin, Yuen-Hei Yeung, Rynson W. H. Lau
	> NeurIPS 2022
	> [Paper](https://openreview.net/forum?id=WrIrYMCZgbb) · [Project Page](https://jiaying.link/neurips2022-gsds/) · [Dataset (GSD-S)](https://huggingface.co/datasets/garrying/GSD-S)

	## Model Summary

	GlassSemNet detects glass surfaces by exploiting semantic relations between the glass region and its surrounding scene context. It uses a dual-backbone design:

	- Spatial backbone (SegFormer): extracts multi-scale spatial features.
	- Semantic backbone (ResNet-50 + DeepLabV3+): encodes 43-class semantic scene features into compact per-class encodings.
	- Semantic-Aware Attention (SAA): fuses spatial and semantic features at three scales using the semantic encodings as guidance.
	- Cross-modal Context Aggregation (CCA): aggregates cross-scale context at the deepest level.
	- UPerNet decoder: combines the fused multi-scale features into the final glass surface prediction.
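
The idea of using per-class semantic encodings to guide spatial features can be illustrated with a generic cross-attention step. This is a dependency-free toy sketch, not the SAA module from the code release: the function name `semantic_guided_fusion`, the scaled dot-product scoring, and the residual addition are all assumptions made for illustration, and the real module may differ in every one of these choices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_guided_fusion(spatial, class_encodings):
    """Toy cross-attention (hypothetical, for illustration only).

    spatial:         list of N feature vectors, one per spatial location
    class_encodings: list of K vectors, one per semantic class
    Every vector has the same length d.
    Returns N fused vectors: each spatial feature plus an
    attention-weighted mix of the class encodings.
    """
    d = len(class_encodings[0])
    scale = 1.0 / math.sqrt(d)
    fused = []
    for feat in spatial:
        # Similarity of this location to every semantic class.
        scores = [scale * sum(f * c for f, c in zip(feat, enc))
                  for enc in class_encodings]
        weights = softmax(scores)
        # Semantic context: weighted sum of the class encodings.
        context = [sum(w * enc[j] for w, enc in zip(weights, class_encodings))
                   for j in range(d)]
        # Residual fusion of spatial feature and semantic context.
        fused.append([f + c for f, c in zip(feat, context)])
    return fused
```

In the model itself the encodings would cover the 43 semantic classes and the fusion would run on tensors at three scales; the sketch only shows how class encodings can steer each spatial location.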

	| File | Description |
	|------|-------------|
	| `GlassSemNet.pth` | Best checkpoint (917 MB), saved as a raw `state_dict` |

	## Loading the Weights

	```python
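	import torch
	from model.GlassSemNet import GlassSemNet  # from the code release

	model = GlassSemNet()
	# The checkpoint is a raw state_dict; adjust the path if you downloaded it to ./weights.
	state_dict = torch.load("GlassSemNet.pth", map_location="cpu")
	model.load_state_dict(state_dict)
	model.eval()  # inference mode: disables dropout and uses stored batch-norm statistics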
	```
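
If `load_state_dict` reports missing or unexpected keys, a small helper that compares parameter names can narrow the problem down. The sketch below is a hypothetical utility, not part of the code release. It also checks for the `module.` prefix that `torch.nn.DataParallel` adds to parameter names; whether this checkpoint carries that prefix is not stated on this card, so treat that branch as a precaution only.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys, prefix="module."):
    """Compare checkpoint and model parameter names.

    Returns (missing, unexpected, has_prefix):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not define
      has_prefix - True if every checkpoint key starts with `prefix`
    """
    checkpoint_keys = list(checkpoint_keys)
    has_prefix = bool(checkpoint_keys) and all(
        key.startswith(prefix) for key in checkpoint_keys
    )
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return sorted(model - ckpt), sorted(ckpt - model), has_prefix


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that have it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Typical use with the objects from the loading snippet:
#   missing, unexpected, has_prefix = diff_state_dict_keys(
#       state_dict.keys(), model.state_dict().keys())
#   if has_prefix:
#       model.load_state_dict(strip_prefix(state_dict))
```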

	Download the checkpoint:
	```bash
	huggingface-cli download garrying/GlassSemNet GlassSemNet.pth --local-dir ./weights
	```
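
The same download can be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch: the wrapper name `download_checkpoint` is hypothetical, and the default `./weights` directory simply mirrors the CLI command above.

```python
REPO_ID = "garrying/GlassSemNet"
FILENAME = "GlassSemNet.pth"


def download_checkpoint(local_dir="./weights"):
    """Download the checkpoint and return the path to the local file."""
    # Imported lazily so this helper can be defined without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)


# checkpoint_path = download_checkpoint()  # fetches the 917 MB file on first use
```

The returned path can be passed directly to `torch.load`.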

	## Inference

	```bash
	python predict.py -c GlassSemNet.pth -i /path/to/images/ -o /path/to/output/
	```

	Images are resized to 384 × 384 internally. Predictions are post-processed with CRF refinement and thresholded to produce binary glass surface masks.
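
The thresholding step can be sketched without any dependencies. The code below is an illustration under stated assumptions, not the logic in `predict.py`: the 0.5 threshold, the 0/255 mask encoding, and the nearest-neighbour resize back to the original resolution are all guesses, and CRF refinement is omitted entirely. In practice you would do this with array operations rather than nested lists.

```python
def binarize(prob_map, threshold=0.5):
    """Turn a 2D grid of probabilities in [0, 1] into a 0/255 mask.

    The threshold value is an assumption; check predict.py for the real one.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in prob_map]


def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2D grid, e.g. from 384 x 384 back to
    the size of the input image."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Example: a 2 x 2 probability map, binarized and upsampled to 4 x 4.
mask = binarize([[0.2, 0.7], [0.5, 0.49]])
full = resize_nearest(mask, 4, 4)
```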

	## Training Dataset

	This model was trained and evaluated on GSD-S, the first glass surface detection dataset with semantic annotations:

	- 4,519 images (3,511 train / 1,008 test) with binary glass masks, instance segmentation maps, and 43-class semantic labels
	- Available at [garrying/GSD-S](https://huggingface.co/datasets/garrying/GSD-S)

	## Citation

	```bibtex
	@article{neurips2022:gsds2022,
	author = {Lin, Jiaying and Yeung, Yuen-Hei and Lau, Rynson W.H.},
	title = {Exploiting Semantic Relations for Glass Surface Detection},
	journal = {NeurIPS},
	year = {2022},
	}
	```

	## License

	Non-commercial use only — [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).