nielsr HF Staff committed on
Commit b565446 · verified · 1 Parent(s): 104459b

Link paper and GitHub repository

This PR improves the model card by adding links to the official paper on Hugging Face Papers and the code repository on GitHub. These additions help researchers locate the original work and implementation more easily. I have also added a brief summary of the ProCap framework based on the paper abstract.
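
Because this change also reorders the YAML front matter, a quick check that the metadata values survive the reordering can be useful. The following is a minimal sketch, not the Hub's own parser: `parse_front_matter` is a hypothetical helper written for this example, and it assumes a flat front-matter block containing only scalar values and simple string lists, as in this README.

```python
def parse_front_matter(text: str) -> dict:
    """Parse a flat front-matter block (scalars and simple string lists only).

    Hypothetical helper for illustration; it is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(meta.get(key), list):
            # List item belonging to the most recent key
            meta[key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key = key.strip()
            value = value.strip()
            # A key with no inline value starts a list
            meta[key] = value if value else []
    return meta


# Front matter as it appears after this commit
CARD = """---
datasets:
- clevr-change
- image-editing-request
- spot-the-diff
license: mit
metrics:
- bleu
- meteor
- rouge
pipeline_tag: image-to-text
tags:
- change captioning
- vision-language
- image-to-text
- procedural reasoning
- multimodal
- pytorch
---

# ProCap: Imagine How to Change
"""

meta = parse_front_matter(CARD)
print(meta["license"], meta["pipeline_tag"], len(meta["tags"]))
```

Running the same helper on the previous version of the front matter should yield an equal dictionary, since only the key order changed.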

Files changed (1)
  1. README.md +16 -10
README.md CHANGED
@@ -1,29 +1,35 @@
  ---
- license: mit
- tags:
- - change captioning
- - vision-language
- - image-to-text
- - procedural reasoning
- - multimodal
- - pytorch
  datasets:
  - clevr-change
  - image-editing-request
  - spot-the-diff
+ license: mit
  metrics:
  - bleu
  - meteor
  - rouge
  pipeline_tag: image-to-text
+ tags:
+ - change captioning
+ - vision-language
+ - image-to-text
+ - procedural reasoning
+ - multimodal
+ - pytorch
  ---

- # ProCap: Experiment Materials
+ # ProCap: Imagine How to Change

  This repository contains the **official experimental materials** for the paper:

  > **Imagine How to Change: Explicit Procedure Modeling for Change Captioning**

+ [[Paper](https://huggingface.co/papers/2603.05969)] [[Code](https://github.com/BlueberryOreo/ProCap)]
+
+ ProCap is a framework that reformulates change modeling from static image comparison to dynamic procedure modeling. It features a two-stage design:
+ 1. **Explicit Procedure Modeling**: Trains a procedure encoder to learn the change procedure from a sparse set of keyframes.
+ 2. **Implicit Procedure Captioning**: Integrates the trained encoder within an encoder-decoder model for captioning using learnable procedure queries.
+
  It provides **processed datasets**, **pre-trained model weights**, and **evaluation tools** for reproducing the results reported in the paper.

  📦 All materials are also available via [Baidu Netdisk](https://pan.baidu.com/s/1t_YXB6J_vkuPxByn2hat2A)
@@ -135,7 +141,7 @@ If you find our work or this repository useful, please consider citing our paper
  @inproceedings{
  sun2026imagine,
  title={Imagine How To Change: Explicit Procedure Modeling for Change Captioning},
- author={Sun, Jiayang and Guo, Zixin and Cao, Min and Zhu, Guibo and Laaksonen, Jorma},
+ author={Sun, Jiayang and Guo, Zixin and Cao, Min and Zhu, Guibo and Laaksonen, Jorma},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  }