garrying committed on
Commit a88d290 · verified · 1 Parent(s): c818279

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +81 -0
README.md ADDED
@@ -0,0 +1,81 @@
+ ---
+ license: cc-by-nc-4.0
+ task_categories:
+ - image-segmentation
+ tags:
+ - glass-surface-detection
+ - semantic-segmentation
+ - scene-understanding
+ - pytorch
+ pretty_name: GlassSemNet (Glass Semantic Network)
+ ---
+
+ # GlassSemNet: Glass Semantic Network
+
+ Pre-trained weights for **GlassSemNet**, introduced in:
+
+ > **Exploiting Semantic Relations for Glass Surface Detection**
+ > Jiaying Lin, Yuen-Hei Yeung, Rynson W. H. Lau
+ > NeurIPS 2022
+ > [Paper](https://openreview.net/forum?id=WrIrYMCZgbb) · [Project Page](https://jiaying.link/neurips2022-gsds/) · [Dataset (GSD-S)](https://huggingface.co/datasets/garrying/GSD-S)
+
+ ## Model Summary
+
+ GlassSemNet detects glass surfaces by exploiting semantic relations between the glass region and its surrounding scene context. It uses a dual-backbone design:
+
+ - **Spatial backbone (SegFormer)**: extracts multi-scale spatial features.
+ - **Semantic backbone (ResNet-50 + DeepLabV3+)**: encodes 43-class semantic scene features into compact per-class encodings.
+ - **Scene Aware Activation (SAA)**: fuses spatial and semantic features at three scales using the semantic encodings as guidance.
+ - **Context Correlation Attention (CCA)**: aggregates cross-scale context at the deepest level.
+ - **UPerNet decoder**: combines the fused multi-scale features into the final glass surface prediction.
+
+ | File | Description |
+ |------|-------------|
+ | `GlassSemNet.pth` | Best checkpoint (917 MB), saved as a raw `state_dict` |
+
+ ## Loading the Weights
+
+ ```python
+ import torch
+ from model.GlassSemNet import GlassSemNet  # from the code release
+
+ model = GlassSemNet()
+ # The checkpoint is a raw state_dict, so it loads directly into the model.
+ # Adjust the path if the file was downloaded elsewhere (e.g. ./weights/).
+ state_dict = torch.load("GlassSemNet.pth", map_location="cpu")
+ model.load_state_dict(state_dict)
+ model.eval()  # switch to inference mode
+ ```
+
+ Download the checkpoint:
+ ```bash
+ huggingface-cli download garrying/GlassSemNet GlassSemNet.pth --local-dir ./weights
+ ```
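
The same download can probably be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; `download_checkpoint` is a hypothetical helper name, not part of the code release, while the repository ID and filename are taken from the CLI command above.

```python
REPO_ID = "garrying/GlassSemNet"  # repository named in the CLI command
FILENAME = "GlassSemNet.pth"      # checkpoint file listed in the model summary


def download_checkpoint(local_dir="./weights"):
    """Fetch the checkpoint into local_dir and return its local path.

    Hypothetical helper for illustration; it is not part of the code release.
    """
    # Imported here so the helper can be defined without the dependency installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
```

Calling `download_checkpoint()` should return the path of the downloaded file, which can then be passed to `torch.load`.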
+
+ ## Inference
+
+ ```bash
+ python predict.py -c GlassSemNet.pth -i /path/to/images/ -o /path/to/output/
+ ```
+
+ Images are resized to **384 × 384** internally. Predictions are post-processed with CRF refinement and thresholded to produce binary glass surface masks.
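
To illustrate the final thresholding step only, here is a dependency-free sketch. The 0.5 cut-off, the 0/255 mask encoding, and the `binarize` name are assumptions for illustration; `predict.py` may use different values, and the CRF refinement is not shown.

```python
def binarize(prob_map, threshold=0.5):
    """Turn a 2-D map of glass probabilities into a 0/255 binary mask.

    Both the threshold and the 0/255 encoding are assumed values.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in prob_map]


# Toy 2 x 3 probability map; a real prediction would be 384 x 384.
probs = [[0.1, 0.7, 0.5],
         [0.9, 0.2, 0.49]]
mask = binarize(probs)
print(mask)  # [[0, 255, 255], [255, 0, 0]]
```

In practice the same comparison would more likely be applied to a tensor or array than to nested lists; plain lists are used here only to keep the sketch self-contained.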
+
+ ## Training Dataset
+
+ This model was trained and evaluated on **GSD-S**, the first glass surface detection dataset with semantic annotations:
+
+ - 4,519 images (3,511 train / 1,008 test) with binary glass masks, instance segmentation maps, and 43-class semantic labels
+ - Available at [garrying/GSD-S](https://huggingface.co/datasets/garrying/GSD-S)
+
+ ## Citation
+
+ ```bibtex
+ @article{neurips2022:gsds2022,
+ author = {Lin, Jiaying and Yeung, Yuen-Hei and Lau, Rynson W.H.},
+ title = {Exploiting Semantic Relations for Glass Surface Detection},
+ journal = {NeurIPS},
+ year = {2022},
+ }
+ ```
+
+ ## License
+
+ Non-commercial use only: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).