nielsr HF Staff committed on
Commit
6cecca7
·
verified ·
1 Parent(s): 16244ee

Add model card and metadata


Hi! I'm Niels from the community science team at Hugging Face. I've improved the model card for this repository by adding relevant metadata and structured information.

Specifically, this PR:
- Adds the `text-generation` pipeline tag for better discoverability.
- Adds `library_name: transformers` metadata based on the `config.json`.
- Links the repository to the [TAPS paper](https://huggingface.co/papers/2603.27027).
- Adds a link to the official GitHub repository for code and setup instructions.
- Includes the BibTeX citation for proper attribution.

This will make the model easier to find and use for the community!

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -1 +1,29 @@
- arxiv.org/abs/2603.27027
+ ---
+ library_name: transformers
+ pipeline_tag: text-generation
+ ---
+
+ # TAPS: Task-Aware Proposal Distributions for Speculative Sampling
+
+ This repository contains draft models from the paper [TAPS: Task Aware Proposal Distributions for Speculative Sampling](https://huggingface.co/papers/2603.27027).
+
+ Speculative decoding accelerates autoregressive generation by letting a lightweight draft model propose future tokens that a larger target model then verifies in parallel. TAPS studies how the draft training distribution (e.g., MathInstruct, ShareGPT) shapes speculative decoding quality.
+
+ - **Paper:** [TAPS: Task Aware Proposal Distributions for Speculative Sampling](https://huggingface.co/papers/2603.27027)
+ - **Repository:** [GitHub - Moe-Zbeeb/TAPS](https://github.com/Moe-Zbeeb/TAPS)
+
+ ## Abstract
+ Speculative decoding speeds up autoregressive generation by letting a lightweight drafter propose tokens that a larger verifier checks in parallel. We study how much draft quality depends on the training distribution using HASS and EAGLE-2 drafts trained on MathInstruct, ShareGPT, and mixed variants. Task-matched drafts specialize; mixed data aids robustness but is not uniformly dominant across temperatures. Results show speculative decoding quality hinges on both draft architecture and the alignment between draft training data and downstream workload.
+
+ ## Model Description
+ This specific checkpoint is a lightweight LLaMA-style drafter (typically 1 layer, ~0.8B parameters) designed to be used in a speculative decoding pipeline, for example with `Meta-Llama-3-8B-Instruct` as the verifier.
+
+ ## Citation
+ ```bibtex
+ @article{zbib2026taps,
+   title={TAPS: Task Aware Proposal Distributions for Speculative Sampling},
+   author={Zbib, Mohamad and Bazzi, Mohamad and Mohanna, Ammar and Ghanem, Bernard and Hammoud, Hasan Abed Al Kader},
+   year={2026},
+   note={Technical report}
+ }
+ ```