| | ---
|
| | base_model: Qwen/Qwen2.5-32B-Instruct
|
| | library_name: transformers
|
| | model_name: step-conditional-control
|
| | tags:
|
| | - generated_from_trainer
|
| | - trl
|
| | - sft
|
| | license: apache-2.0
|
| | language:
|
| | - zho
|
| | - eng
|
| | - fra
|
| | - spa
|
| | - por
|
| | - deu
|
| | - ita
|
| | - rus
|
| | - jpn
|
| | - kor
|
| | - vie
|
| | - tha
|
| | - ara
|
| | ---
|
| |
|
| | # Model Summary
|
| |
|
| | - **Repository:** [simplescaling/s1](https://github.com/simplescaling/s1)
|
| | - **Paper:** https://arxiv.org/abs/2501.19393
|
| |
|
| | # Use
|
| |
|
| | This is the token-conditional control model for our paper. You can evaluate using the information [here](https://github.com/simplescaling/s1?tab=readme-ov-file#evaluation).
|
| |
|
| | # Training information
|
| |
|
| | [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/hashimoto-group/o1/runs/xaantfal)
|
| |
|
| | - TRL: 0.13.0
|
| | - Transformers: 4.48.0
|
| | - Pytorch: 2.3.1
|
| | - Datasets: 3.0.1
|
| | - Tokenizers: 0.21.0
|
| |
|
| | # Citation
|
| |
|
| | ```bibtex
|
| | @misc{muennighoff2025s1simpletesttimescaling,
|
| | title={s1: Simple test-time scaling},
|
| | author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
|
| | year={2025},
|
| | eprint={2501.19393},
|
| | archivePrefix={arXiv},
|
| | primaryClass={cs.CL},
|
| | url={https://arxiv.org/abs/2501.19393},
|
| | }
|
| | ``` |