Add library_name and pipeline_tag metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +35 -41
README.md CHANGED
@@ -1,83 +1,77 @@
1
  ---
2
- license: apache-2.0
3
- tags:
4
- - object-detection
5
- - region-proposal
6
- - open-set-detection
7
- - zero-shot-detection
8
- - mmdetection
9
- - pytorch
10
- - cvpr2026
11
  datasets:
12
  - coco
13
  - imagenet
14
  - cd-fsod
15
  - odinw
 
16
  metrics:
17
  - average-recall (AR)
18
  ---
19
 
20
  # PF-RPN: Prompt-Free Universal Region Proposal Network
21
 
22
- ## 🧠 Model Details
23
 
24
- **PF-RPN** (Prompt-Free Universal Region Proposal Network) is a state-of-the-art model for Cross-Domain Open-Set Region Proposal Network, accepted at **CVPR 2026**.
25
 
26
- Open-vocabulary detectors typically rely on text prompts (class names), which can be unavailable, noisy, or domain-sensitive during deployment. PF-RPN tackles this by revisiting region proposal generation under a strictly **prompt-free** setting. Instead of specific category names, all categories are unified into a single token (`object`).
 
 
27
 
28
  ### Model Architecture Innovations
29
  To improve proposal quality without explicit class prompts, PF-RPN introduces three key designs:
30
- 1. **Sparse Image-Aware Adapter:** Constructs pseudo-text representations from multi-level visual features.
31
- 2. **Cascade Self-Prompt:** Iteratively enhances visual-text alignments via masked pooling.
32
- 3. **Centerness-Guided Query Selection:** Selects top-k decoder queries using joint confidence scores.
33
 
34
  ### Model Sources
35
  - **Repository:** [PF-RPN GitHub Repository](https://github.com/tangqh03/PF-RPN)
36
- - **Paper:** PF-RPN: Prompt-Free Universal Region Proposal Network (CVPR 2026)
37
  - **Base Framework:** [MMDetection 3.3.0](https://github.com/open-mmlab/mmdetection)
38
  - **Backbone:** Swin-Base (`swinb`)
39
 
40
  ## 🎯 Intended Use
41
 
42
  - **Primary Use Case:** Generating high-quality, class-agnostic region proposals ("objects") across diverse, unseen domains without requiring domain-specific text prompts or retraining.
43
- - **Protocol:** Strict one-class open-set setup where `custom_classes = ('object',)`.
44
 
45
  ## 🗂️ Training Data
46
 
47
- The provided checkpoint (`pf_rpn_swinb_5p_coco_imagenet.pth`) was trained on a combined dataset of **COCO 2017** and **ImageNet-1k**.
48
  - To simulate the open-set proposal generation task, all ground-truth categories are merged into a single class (`object`).
49
- - The specific released model uses a **5% subset** of the COCO training data merged with ImageNet images.
50
 
51
- ## 📊 Evaluation Data and Performance
52
 
53
  PF-RPN achieves state-of-the-art Average Recall (AR) under prompt-free evaluation across multiple benchmarks.
54
 
55
  ### Cross-Domain Few-Shot Object Detection (CD-FSOD)
56
- Evaluated across 6 target domains (ArTaxOr, clipart1k, DIOR, FISH, NEUDET, UODD).
57
-
58
- | Method | Prompt Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
59
- |---|:---:|---:|---:|---:|---:|---:|---:|
60
- | GDINO‡ | ✓ | 54.7 | 57.8 | 61.6 | 34.1 | 49.3 | 67.0 |
61
- | GenerateU | ✓ | 47.7 | 54.1 | 55.7 | 28.1 | 48.3 | 69.4 |
62
- | Cascade RPN | ✓ | 45.8 | 52.0 | 56.9 | 31.1 | 50.5 | 66.0 |
63
- | **PF-RPN (Ours)** | **✓** | **60.7** | **65.3** | **68.2** | **38.5** | **61.9** | **80.3** |
64
 
65
  ### Object Detection in the Wild (ODinW13)
66
- Evaluated across 13 diverse target domains.
67
-
68
- | Method | Prompt Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
69
- |---|:---:|---:|---:|---:|---:|---:|---:|
70
- | GDINO‡ | ✓ | 69.1 | 70.9 | 72.4 | 40.8 | 64.6 | 78.4 |
71
- | GenerateU | ✓ | 67.3 | 71.5 | 72.2 | 32.8 | 63.1 | 80.0 |
72
- | Cascade RPN | ✓ | 60.9 | 65.5 | 70.2 | 40.3 | 65.5 | 75.0 |
73
- | **PF-RPN (Ours)** | **✓** | **76.5** | **78.6** | **79.8** | 45.4 | **71.9** | **85.8** |
74
 
75
  *(‡ indicates models where original class names were replaced with `object` to simulate a prompt-free setting).*
76
 
77
  ## ⚙️ How to Use
78
 
79
  ### Installation
80
- Ensure you have a working environment with Python 3.10, PyTorch 2.1.0, and CUDA 11.8. Install MMDetection and this repository's codebase as described in the [GitHub README](https://github.com/tangqh03/PF-RPN#%EF%B8%8F-installation).
81
 
82
  ### Quick Start: Evaluation
83
 
@@ -87,19 +81,19 @@ mkdir -p checkpoints
87
 
88
  # Download GroundingDINO base weights
89
  wget -O checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth \
90
- [https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth)
91
 
92
  # Download PF-RPN weights
93
  wget -O checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth \
94
- [https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth](https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth)
 
95
 
96
- 2. **Run Inference / Testing**
97
  ```bash
98
  python tools/test.py \
99
  configs/pf-rpn/pf-rpn_coco-imagenet.py \
100
  checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth
101
  ```
102
- Note: Data preprocessing is required before evaluation. Datasets must be downloaded and their categories merged into a single `object` class using the provided `tools/merge_classes_and_sample_subset.py` script. See the repository for detailed data preparation commands.
103
 
104
  ## 📚 Citation
105
  If you use PF-RPN in your research, please cite:
 
1
  ---
2
  datasets:
3
  - coco
4
  - imagenet
5
  - cd-fsod
6
  - odinw
7
+ license: apache-2.0
8
  metrics:
9
  - average-recall (AR)
10
+ library_name: mmdetection
11
+ pipeline_tag: object-detection
12
+ tags:
13
+ - region-proposal
14
+ - open-set-detection
15
+ - zero-shot-detection
16
+ - pytorch
17
+ - cvpr2026
18
  ---
19
 
20
  # PF-RPN: Prompt-Free Universal Region Proposal Network
21
 
22
+ This is the official implementation of **PF-RPN**, a state-of-the-art model for Cross-Domain Open-Set Region Proposal generation, accepted at **CVPR 2026**.
23
 
24
+ [**Paper**](https://huggingface.co/papers/2603.17554) | [**GitHub Repository**](https://github.com/tangqh03/PF-RPN)
25
 
26
+ ## 🧠 Model Details
27
+
28
+ **PF-RPN** (Prompt-Free Universal Region Proposal Network) identifies potential objects without relying on external prompts (like class names, exemplar images, or textual descriptions). Instead of specific category names, all categories are unified into a single learnable token (`object`).
29
 
30
  ### Model Architecture Innovations
31
  To improve proposal quality without explicit class prompts, PF-RPN introduces three key designs:
32
+ 1. **Sparse Image-Aware Adapter (SIA):** Performs initial localization of potential objects using a learnable query embedding dynamically updated with visual features.
33
+ 2. **Cascade Self-Prompt (CSP):** Identifies remaining objects by leveraging self-prompted learnable embeddings, autonomously aggregating informative visual features in a cascading manner.
34
+ 3. **Centerness-Guided Query Selection (CG-QS):** Facilitates the selection of high-quality query embeddings using a centerness scoring network.
35
 
36
  ### Model Sources
37
  - **Repository:** [PF-RPN GitHub Repository](https://github.com/tangqh03/PF-RPN)
 
38
  - **Base Framework:** [MMDetection 3.3.0](https://github.com/open-mmlab/mmdetection)
39
  - **Backbone:** Swin-Base (`swinb`)
40
 
41
  ## 🎯 Intended Use
42
 
43
  - **Primary Use Case:** Generating high-quality, class-agnostic region proposals ("objects") across diverse, unseen domains without requiring domain-specific text prompts or retraining.
44
+ - **Applications:** Underwater object detection, industrial defect detection, and remote sensing image object detection.
45
 
46
  ## 🗂️ Training Data
47
 
48
+ The provided checkpoint (`pf_rpn_swinb_5p_coco_imagenet.pth`) was trained on a combined dataset of **COCO 2017** (5% subset) and **ImageNet-1k**.
49
  - To simulate the open-set proposal generation task, all ground-truth categories are merged into a single class (`object`).
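The class-merging step described above can be sketched as follows. This is a minimal illustration that assumes COCO-style annotation dictionaries; it is not the repository's `tools/merge_classes_and_sample_subset.py` script, and the function name `merge_to_single_class` and the example data are hypothetical.

```python
import copy


def merge_to_single_class(coco, class_name="object"):
    """Return a copy of a COCO-style annotation dict with all categories merged into one.

    Hypothetical helper for illustration only; the original dict is left untouched.
    """
    merged = copy.deepcopy(coco)
    # A single category replaces the original label set.
    merged["categories"] = [{"id": 1, "name": class_name}]
    # Every box keeps its geometry but loses its original label.
    for ann in merged["annotations"]:
        ann["category_id"] = 1
    return merged


# Tiny made-up annotation file with two categories.
example = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "categories": [{"id": 17, "name": "cat"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 17, "bbox": [10, 20, 30, 40]},
        {"id": 2, "image_id": 1, "category_id": 18, "bbox": [50, 60, 70, 80]},
    ],
}

merged = merge_to_single_class(example)
print(merged["categories"])
```

Because the boxes themselves are unchanged, the merged file can be used wherever the original annotations were, with `object` as the only class.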
 
50
 
51
+ ## 📊 Performance
52
 
53
  PF-RPN achieves state-of-the-art Average Recall (AR) under prompt-free evaluation across multiple benchmarks.
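For readers unfamiliar with the metric, the sketch below shows one common way to compute class-agnostic Average Recall at a proposal budget `k`. It assumes that AR100, AR300, and AR900 follow the COCO convention (top-`k` proposals, recall averaged over IoU thresholds 0.50 to 0.95); the model card does not state this explicitly. The code is a simplification: it handles a single image and uses the best IoU per ground-truth box rather than one-to-one matching, so it is not the evaluation code behind the reported numbers.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_recall(gt_boxes, proposals, k, thresholds=None):
    """Average recall of the top-k proposals over a set of IoU thresholds.

    `proposals` must already be sorted by descending confidence.
    """
    if thresholds is None:
        # Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    if not gt_boxes:
        return 0.0
    top_k = proposals[:k]
    # Best overlap achieved for each ground-truth box (no one-to-one matching).
    best = [max((iou(g, p) for p in top_k), default=0.0) for g in gt_boxes]
    # Recall at each threshold: fraction of ground-truth boxes that are covered.
    recalls = [sum(b >= t for b in best) / len(gt_boxes) for t in thresholds]
    return sum(recalls) / len(recalls)


# Made-up boxes: two ground-truth objects and three ranked proposals.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
props = [(0, 0, 10, 10), (100, 100, 110, 110), (20, 20, 30, 30)]
ar_at_1 = average_recall(gt, props, k=1)
ar_at_3 = average_recall(gt, props, k=3)
```

In this toy case the second object is only covered by the third-ranked proposal, so recall rises as the budget grows, which is the same effect the AR100, AR300, and AR900 columns capture.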
54
 
55
  ### Cross-Domain Few-Shot Object Detection (CD-FSOD)
56
+ | Method | Prompt Free | AR100 | AR300 | AR900 |
57
+ |---|:---:|---:|---:|---:|
58
+ | GDINO‡ | ✓ | 54.7 | 57.8 | 61.6 |
59
+ | GenerateU | ✓ | 47.7 | 54.1 | 55.7 |
60
+ | **PF-RPN (Ours)** | **✓** | **60.7** | **65.3** | **68.2** |
61
 
62
  ### Object Detection in the Wild (ODinW13)
63
+ | Method | Prompt Free | AR100 | AR300 | AR900 |
64
+ |---|:---:|---:|---:|---:|
65
+ | GDINO‡ | ✓ | 69.1 | 70.9 | 72.4 |
66
+ | GenerateU | ✓ | 67.3 | 71.5 | 72.2 |
67
+ | **PF-RPN (Ours)** | **✓** | **76.5** | **78.6** | **79.8** |
68
 
69
  *(‡ indicates models where original class names were replaced with `object` to simulate a prompt-free setting).*
70
 
71
  ## ⚙️ How to Use
72
 
73
  ### Installation
74
+ The codebase is built on MMDetection. Please follow the [installation instructions](https://github.com/tangqh03/PF-RPN#%EF%B8%8F-installation) in the official repository.
75
 
76
  ### Quick Start: Evaluation
77
 
 
81
 
82
  # Download GroundingDINO base weights
83
  wget -O checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth \
84
+ https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth
85
 
86
  # Download PF-RPN weights
87
  wget -O checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth \
88
+ https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth
89
+ ```
90
 
91
+ 2. **Run Testing**
92
  ```bash
93
  python tools/test.py \
94
  configs/pf-rpn/pf-rpn_coco-imagenet.py \
95
  checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth
96
  ```
 
97
 
98
  ## 📚 Citation
99
  If you use PF-RPN in your research, please cite: