@@ -1,83 +1,77 @@
 ---
-license: apache-2.0
-tags:
-- object-detection
-- region-proposal
-- open-set-detection
-- zero-shot-detection
-- mmdetection
-- pytorch
-- cvpr2026
 datasets:
 - coco
 - imagenet
 - cd-fsod
 - odinw
 metrics:
 - average-recall (AR)
 ---
 # PF-RPN: Prompt-Free Universal Region Proposal Network
-## 🧠 Model Details
-**PF-RPN** (Prompt-Free Universal Region Proposal Network) is a state-of-the-art model for Cross-Domain Open-Set Region Proposal Network, accepted at **CVPR 2026**.
-Open-vocabulary detectors typically rely on text prompts (class names), which can be unavailable, noisy, or domain-sensitive during deployment. PF-RPN tackles this by revisiting region proposal generation under a strictly **prompt-free** setting. Instead of specific category names, all categories are unified into a single token (`object`).
 ### Model Architecture Innovations
 To improve proposal quality without explicit class prompts, PF-RPN introduces three key designs:
-1. **Sparse Image-Aware Adapter:** Constructs pseudo-text representations from multi-level visual features.
-2. **Cascade Self-Prompt:** Iteratively enhances visual-text alignments via masked pooling.
-3. **Centerness-Guided Query Selection:** Selects top-k decoder queries using joint confidence scores.
 ### Model Sources
 - **Repository:** [PF-RPN GitHub Repository](https://github.com/tangqh03/PF-RPN)
-- **Paper:** PF-RPN: Prompt-Free Universal Region Proposal Network (CVPR 2026)
 - **Base Framework:** [MMDetection 3.3.0](https://github.com/open-mmlab/mmdetection)
 - **Backbone:** Swin-Base (`swinb`)
 ## 🎯 Intended Use
 - **Primary Use Case:** Generating high-quality, class-agnostic region proposals ("objects") across diverse, unseen domains without requiring domain-specific text prompts or retraining.
-- **Protocol:** Strict one-class open-set setup where `custom_classes = ('object',)`.
 ## 🗂️ Training Data
-The provided checkpoint (`pf_rpn_swinb_5p_coco_imagenet.pth`) was trained on a combined dataset of **COCO 2017** and **ImageNet-1k**.
 - To simulate the open-set proposal generation task, all ground-truth categories are merged into a single class (`object`).
-- The specific released model uses a **5% subset** of the COCO training data merged with ImageNet images.
-## 📊 Evaluation Data and Performance
 PF-RPN achieves state-of-the-art Average Recall (AR) under prompt-free evaluation across multiple benchmarks.
 ### Cross-Domain Few-Shot Object Detection (CD-FSOD)
-Evaluated across 6 target domains (ArTaxOr, clipart1k, DIOR, FISH, NEUDET, UODD).
-| Method | Prompt Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
-|---|:---:|---:|---:|---:|---:|---:|---:|
-| GDINO‡ | ✓ | 54.7 | 57.8 | 61.6 | 34.1 | 49.3 | 67.0 |
-| GenerateU | ✓ | 47.7 | 54.1 | 55.7 | 28.1 | 48.3 | 69.4 |
-| Cascade RPN | ✓ | 45.8 | 52.0 | 56.9 | 31.1 | 50.5 | 66.0 |
-| **PF-RPN (Ours)** | **✓** | **60.7** | **65.3** | **68.2** | **38.5** | **61.9** | **80.3** |
 ### Object Detection in the Wild (ODinW13)
-Evaluated across 13 diverse target domains.
-| Method | Prompt Free | AR100 | AR300 | AR900 | ARs | ARm | ARl |
-|---|:---:|---:|---:|---:|---:|---:|---:|
-| GDINO‡ | ✓ | 69.1 | 70.9 | 72.4 | 40.8 | 64.6 | 78.4 |
-| GenerateU | ✓ | 67.3 | 71.5 | 72.2 | 32.8 | 63.1 | 80.0 |
-| Cascade RPN | ✓ | 60.9 | 65.5 | 70.2 | 40.3 | 65.5 | 75.0 |
-| **PF-RPN (Ours)** | **✓** | **76.5** | **78.6** | **79.8** | 45.4 | **71.9** | **85.8** |
 *(‡ indicates models where original class names were replaced with `object` to simulate a prompt-free setting).*
 ## ⚙️ How to Use
 ### Installation
-Ensure you have a working environment with Python 3.10, PyTorch 2.1.0, and CUDA 11.8. Install MMDetection and this repository's codebase as described in the [GitHub README](https://github.com/tangqh03/PF-RPN#%EF%B8%8F-installation).
 ### Quick Start: Evaluation
@@ -87,19 +81,19 @@ mkdir -p checkpoints
 # Download GroundingDINO base weights
 wget -O checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth \
-  [https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth)
 # Download PF-RPN weights
 wget -O checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth \
-  [https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth](https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth)
-2. **Run Inference / Testing**
 ```bash
 python tools/test.py \
   configs/pf-rpn/pf-rpn_coco-imagenet.py \
   checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth
 ```
-Note: Data preprocessing is required before evaluation. Datasets must be downloaded and their categories merged into a single `object` class using the provided `tools/merge_classes_and_sample_subset.py` script. See the repository for detailed data preparation commands.
 ## 📚 Citation
 If you use PF-RPN in your research, please cite:

 ---
 datasets:
 - coco
 - imagenet
 - cd-fsod
 - odinw
+license: apache-2.0
 metrics:
 - average-recall (AR)
+library_name: mmdetection
+pipeline_tag: object-detection
+tags:
+- region-proposal
+- open-set-detection
+- zero-shot-detection
+- pytorch
+- cvpr2026
 ---
 # PF-RPN: Prompt-Free Universal Region Proposal Network
+This is the official implementation of **PF-RPN**, a state-of-the-art model for Cross-Domain Open-Set Region Proposal generation, accepted at **CVPR 2026**.
+[**Paper**](https://huggingface.co/papers/2603.17554) | [**GitHub Repository**](https://github.com/tangqh03/PF-RPN)
+## 🧠 Model Details
+**PF-RPN** (Prompt-Free Universal Region Proposal Network) identifies potential objects without relying on external prompts such as class names, exemplar images, or textual descriptions. Instead of specific category names, all categories are unified into a single learnable token (`object`).
 ### Model Architecture Innovations
 To improve proposal quality without explicit class prompts, PF-RPN introduces three key designs:
+1. **Sparse Image-Aware Adapter (SIA):** Performs initial localization of potential objects using a learnable query embedding dynamically updated with visual features.
+2. **Cascade Self-Prompt (CSP):** Identifies remaining objects by leveraging self-prompted learnable embeddings, autonomously aggregating informative visual features in a cascading manner.
+3. **Centerness-Guided Query Selection (CG-QS):** Facilitates the selection of high-quality query embeddings using a centerness scoring network.
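
The query selection step can be pictured with a small sketch. The card does not state how the centerness score is combined with the objectness confidence, so the product used here is an assumption, and `select_queries` is a hypothetical helper rather than a function from the PF-RPN codebase.

```python
from typing import List, Sequence


def select_queries(cls_scores: Sequence[float],
                   centerness: Sequence[float],
                   k: int) -> List[int]:
    """Return the indices of the k queries with the highest joint score."""
    if len(cls_scores) != len(centerness):
        raise ValueError("one centerness value is needed per query")
    # Assumption: joint score = objectness confidence * centerness.
    joint = [c * z for c, z in zip(cls_scores, centerness)]
    # Rank query indices by joint score, highest first (ties keep input order).
    order = sorted(range(len(joint)), key=lambda i: joint[i], reverse=True)
    return order[:k]


# Query 0 is confident but off-center, so it loses to queries 1 and 2.
print(select_queries([0.9, 0.8, 0.6, 0.7], [0.2, 0.9, 0.9, 0.1], k=2))  # [1, 2]
```

Under this assumption, a query needs both a high objectness score and a well-centered prediction to be kept.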
 ### Model Sources
 - **Repository:** [PF-RPN GitHub Repository](https://github.com/tangqh03/PF-RPN)
 - **Base Framework:** [MMDetection 3.3.0](https://github.com/open-mmlab/mmdetection)
 - **Backbone:** Swin-Base (`swinb`)
 ## 🎯 Intended Use
 - **Primary Use Case:** Generating high-quality, class-agnostic region proposals ("objects") across diverse, unseen domains without requiring domain-specific text prompts or retraining.
+- **Applications:** Underwater object detection, industrial defect detection, and object detection in remote sensing imagery.
 ## 🗂️ Training Data
+The provided checkpoint (`pf_rpn_swinb_5p_coco_imagenet.pth`) was trained on a combined dataset of **COCO 2017** (5% subset) and **ImageNet-1k**.
 - To simulate the open-set proposal generation task, all ground-truth categories are merged into a single class (`object`).
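
As a rough illustration of the class merge, the sketch below rewrites a COCO-style annotation dictionary so that every box belongs to one `object` category. It assumes the standard COCO JSON layout (`categories`, `annotations`, `category_id`); `merge_to_single_class` is a hypothetical helper, not the repository's own preprocessing code.

```python
import copy
from typing import Any, Dict


def merge_to_single_class(coco: Dict[str, Any], name: str = "object") -> Dict[str, Any]:
    """Return a copy of a COCO-style dict with all categories merged into one."""
    merged = copy.deepcopy(coco)  # leave the caller's data untouched
    merged["categories"] = [{"id": 1, "name": name}]
    for ann in merged.get("annotations", []):
        ann["category_id"] = 1  # every box now carries the single class id
    return merged


sample = {
    "images": [{"id": 7}],
    "categories": [{"id": 1, "name": "cat"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 1, "image_id": 7, "category_id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 2, "image_id": 7, "category_id": 18, "bbox": [5, 5, 20, 20]},
    ],
}
merged = merge_to_single_class(sample)
print(merged["categories"])  # [{'id': 1, 'name': 'object'}]
```

Box coordinates and image records are unchanged; only the label space collapses.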
+## 📊 Performance
 PF-RPN achieves state-of-the-art Average Recall (AR) under prompt-free evaluation across multiple benchmarks.
 ### Cross-Domain Few-Shot Object Detection (CD-FSOD)
+| Method | Prompt Free | AR100 | AR300 | AR900 |
+|---|:---:|---:|---:|---:|
+| GDINO‡ | ✓ | 54.7 | 57.8 | 61.6 |
+| GenerateU | ✓ | 47.7 | 54.1 | 55.7 |
+| **PF-RPN (Ours)** | **✓** | **60.7** | **65.3** | **68.2** |
 ### Object Detection in the Wild (ODinW13)
+| Method | Prompt Free | AR100 | AR300 | AR900 |
+|---|:---:|---:|---:|---:|
+| GDINO‡ | ✓ | 69.1 | 70.9 | 72.4 |
+| GenerateU | ✓ | 67.3 | 71.5 | 72.2 |
+| **PF-RPN (Ours)** | **✓** | **76.5** | **78.6** | **79.8** |
 *(‡ indicates models where original class names were replaced with `object` to simulate a prompt-free setting).*
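
For readers unfamiliar with the metric, the following sketch computes AR@k for a single image. It assumes the common COCO-style definition (recall of the top-k proposals, averaged over IoU thresholds 0.50 to 0.95 in steps of 0.05) and simplifies matching to the best-overlapping proposal per ground-truth box. The card does not spell out the exact protocol, so treat this as an illustration rather than the evaluation code behind the tables.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_recall(gt: Sequence[Box], proposals: Sequence[Box], k: int) -> float:
    """AR@k for one image; proposals must already be sorted by score."""
    if not gt:
        return 0.0
    thresholds = [(50 + 5 * i) / 100 for i in range(10)]  # 0.50, 0.55, ..., 0.95
    top = proposals[:k]
    # Simplification: each ground-truth box takes its best-overlapping proposal.
    best = [max((iou(g, p) for p in top), default=0.0) for g in gt]
    recalls = [sum(b >= t for b in best) / len(gt) for t in thresholds]
    return sum(recalls) / len(thresholds)


gt_boxes = [(0, 0, 10, 10)]
ranked = [(100, 100, 110, 110), (0, 0, 10, 10)]
print(average_recall(gt_boxes, ranked, k=1))  # 0.0, the matching box is ranked second
print(average_recall(gt_boxes, ranked, k=2))  # 1.0
```

A dataset-level score would pool ground-truth boxes across all images before averaging.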
 ## ⚙️ How to Use
 ### Installation
+The codebase is built on MMDetection. Please follow the [installation instructions](https://github.com/tangqh03/PF-RPN#%EF%B8%8F-installation) in the official repository.
 ### Quick Start: Evaluation
 # Download GroundingDINO base weights
 wget -O checkpoints/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth \
+  https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth
 # Download PF-RPN weights
 wget -O checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth \
+  https://huggingface.co/tangqh/PF-RPN/resolve/main/pf_rpn_swinb_5p_coco_imagenet.pth
+```
+2. **Run Testing**
 ```bash
 python tools/test.py \
   configs/pf-rpn/pf-rpn_coco-imagenet.py \
   checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth
 ```
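
Before launching the test script, it can help to confirm that the config and checkpoint are where the command expects them. The sketch below is a hypothetical preflight helper (`build_test_command` is not part of the repository); it only checks the two paths that appear in the command above and assumes it is run from the repository root.

```python
from pathlib import Path
from typing import List

CONFIG = "configs/pf-rpn/pf-rpn_coco-imagenet.py"
CHECKPOINT = "checkpoints/pf_rpn_swinb_5p_coco_imagenet.pth"


def build_test_command(root: str = ".") -> List[str]:
    """Return the evaluation command, or raise if a required file is missing."""
    missing = [p for p in (CONFIG, CHECKPOINT) if not (Path(root) / p).is_file()]
    if missing:
        raise FileNotFoundError(f"missing required files: {missing}")
    return ["python", "tools/test.py", CONFIG, CHECKPOINT]


if __name__ == "__main__":
    try:
        print(" ".join(build_test_command()))
    except FileNotFoundError as err:
        print(err)
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the check succeeds.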
 ## 📚 Citation
 If you use PF-RPN in your research, please cite: