nielsr (HF Staff) committed
Commit 3264d69 · verified · 1 parent: 0b48079

Add model card for GDSD

This PR adds a model card for the GDSD checkpoint. It includes:
- Metadata for `pipeline_tag` and `library_name`.
- A link to the paper [GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models](https://huggingface.co/papers/2605.29398).
- A link to the official GitHub repository.
- The BibTeX citation for researchers.

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ library_name: transformers
+ pipeline_tag: text-generation
+ ---
+
+ # GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models
+
+ This repository contains the model weights for GDSD (Guided Denoiser Self-Distillation), as presented in the paper [GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models](https://arxiv.org/abs/2605.29398).
+
+ GDSD is a reinforcement learning (RL) framework for diffusion large language models (dLLMs) that bypasses the intractability of policy likelihood. It distills the denoiser of dLLMs from an advantage-guided self-teacher derived from the closed-form optimum of reverse-KL regularized RL. This method avoids the Training-Inference Mismatch (TIM) biases common in ELBO-based approaches, leading to more stable training and improved performance on planning, math, and coding benchmarks.
+
+ - **Paper:** [https://arxiv.org/abs/2605.29398](https://arxiv.org/abs/2605.29398)
+ - **Repository:** [https://github.com/GaryBall/GDSD](https://github.com/GaryBall/GDSD)
+
+ ## Citation
+
+ If you find GDSD helpful, please consider citing:
+
+ ```bibtex
+ @misc{tang2026gdsdreinforcementlearningguided,
+ title={GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models},
+ author={Xiaohang Tang and Keyue Jiang and Che Liu and Qifang Zhao and Xiaoxiao Xu and Sangwoong Yoon and Ilija Bogunovic},
+ year={2026},
+ eprint={2605.29398},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2605.29398},
+ }
+ ```
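
For readers of the model card, the phrase "closed-form optimum of reverse-KL regularized RL" can be illustrated with the standard textbook result below. This is a sketch under assumptions, not the paper's exact objective: the symbols $\pi_{\mathrm{ref}}$ (reference policy), $A(x, y)$ (advantage of response $y$ to prompt $x$), and $\beta > 0$ (regularization strength) are introduced here for illustration, and the card does not say how GDSD maps this optimum onto the denoiser.

```latex
\max_{\pi}\; \mathbb{E}_{y \sim \pi(\cdot \mid x)}\big[A(x, y)\big]
  - \beta\, \mathrm{KL}\big(\pi(\cdot \mid x)\,\|\,\pi_{\mathrm{ref}}(\cdot \mid x)\big)
\quad\Longrightarrow\quad
\pi^{*}(y \mid x) = \frac{\pi_{\mathrm{ref}}(y \mid x)\, \exp\!\big(A(x, y)/\beta\big)}{Z(x)},
\qquad
Z(x) = \sum_{y'} \pi_{\mathrm{ref}}(y' \mid x)\, \exp\!\big(A(x, y')/\beta\big)
```

In this generic form, the optimum reweights the reference policy toward high-advantage responses. An "advantage-guided self-teacher" in the card's wording presumably plays the role of $\pi^{*}$, with the current model standing in for $\pi_{\mathrm{ref}}$.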