nielsr HF Staff committed on
Commit
6cecca7
·
verified ·
1 Parent(s): 16244ee

Add model card and metadata


Hi! I'm Niels from the community science team at Hugging Face. I've improved the model card for this repository by adding relevant metadata and structured information.

Specifically, this PR:
- Adds the `text-generation` pipeline tag for better discoverability.
- Adds `library_name: transformers` metadata based on the `config.json`.
- Links the repository to the [TAPS paper](https://huggingface.co/papers/2603.27027).
- Adds a link to the official GitHub repository for code and setup instructions.
- Includes the BibTeX citation for proper attribution.

This will make the model easier to find and use for the community!

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -1 +1,29 @@
- arxiv.org/abs/2603.27027
+ ---
+ library_name: transformers
+ pipeline_tag: text-generation
+ ---
+
+ # TAPS: Task-Aware Proposal Distributions for Speculative Sampling
+
+ This repository contains draft models from the paper [TAPS: Task Aware Proposal Distributions for Speculative Sampling](https://huggingface.co/papers/2603.27027).
+
+ Speculative decoding accelerates autoregressive generation by letting a lightweight draft model propose future tokens that a larger target model then verifies in parallel. TAPS studies how the draft training distribution (e.g., MathInstruct, ShareGPT) shapes speculative decoding quality.
+
+ - **Paper:** [TAPS: Task Aware Proposal Distributions for Speculative Sampling](https://huggingface.co/papers/2603.27027)
+ - **Repository:** [GitHub - Moe-Zbeeb/TAPS](https://github.com/Moe-Zbeeb/TAPS)
+
+ ## Abstract
+ Speculative decoding speeds up autoregressive generation by letting a lightweight drafter propose tokens that a larger verifier checks in parallel. We study how much draft quality depends on the training distribution using HASS and EAGLE-2 drafts trained on MathInstruct, ShareGPT, and mixed variants. Task-matched drafts specialize; mixed data aids robustness but is not uniformly dominant across temperatures. Results show speculative decoding quality hinges on both draft architecture and the alignment between draft training data and downstream workload.
+
+ ## Model Description
+ This specific checkpoint is a lightweight LLaMA-style drafter (typically 1 layer, ~0.8B parameters) designed to be used in a speculative decoding pipeline, for example with `Meta-Llama-3-8B-Instruct` as the verifier.
+
+ ## Citation
+ ```bibtex
+ @article{zbib2026taps,
+   title={TAPS: Task Aware Proposal Distributions for Speculative Sampling},
+   author={Zbib, Mohamad and Bazzi, Mohamad and Mohanna, Ammar and Ghanem, Bernard and Hammoud, Hasan Abed Al Kader},
+   year={2026},
+   note={Technical report}
+ }
+ ```