arxiv:2402.17139
Sherry Yang
sherryy
Models 10
sherryy/Qwen2-0.5B-GRPO-test
Updated
sherryy/best5-next10-nopizza-nonomad_sft_90
Text Generation • 8B • Updated
sherryy/pizza_rwr_2k-1k
Text Generation • 8B • Updated
sherryy/pizza_rwr_k10_iter1
Text Generation • 8B • Updated • 4
sherryy/pizza_rwr_iter1
Text Generation • 8B • Updated
sherryy/pizza_rwr_k10
Text Generation • 8B • Updated
sherryy/pizza_rwr
Text Generation • 8B • Updated
sherryy/pizza_sft_90
Text Generation • 8B • Updated • 2
sherryy/pizza_sft
Text Generation • 8B • Updated
sherryy/math-baseline
Text Generation • 8B • Updated
Datasets 14
sherryy/best5-next10-nopizza-nonomad_sft_90
Viewer • Updated • 78.6k • 29
sherryy/pizza_rwr_k10_iter1
Viewer • Updated • 24.4k • 13
sherryy/pizza_rwr_iter1
Viewer • Updated • 42.4k • 4
sherryy/pizza_rwr
Viewer • Updated • 83k • 26
sherryy/tree_dataset
Viewer • Updated • 11.1k • 8
sherryy/pizza_sft
Viewer • Updated • 37.8k • 27
sherryy/pizza_dpo
Viewer • Updated • 5.61k • 6
sherryy/math12k
Viewer • Updated • 12.5k • 14
sherryy/random-acts-of-pizza
Viewer • Updated • 59.5k • 36
sherryy/test_data
Updated • 4