arxiv:2601.00514
Liv d'Aliberti
od2961
models 44
od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-2k
2B • Updated • 1
od2961/Qwen2.5-7B-Open-R1-MaxEnt-GRPO-math-2k
333k • Updated
od2961/Qwen2.5-7B-Open-R1-GRPO-math-2k
333k • Updated • 1
od2961/Qwen2.5-1.5B-Open-R1-MaxEnt-GRPO-math-v1
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-MaxEnt-GRPO-BASELINE-math-v1
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-v1
Text Generation • 2B • Updated • 290
od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-v1-grpoonly
Updated
od2961/Qwen2.5-1.5B-OpenR1-GRPO-GUN
2B • Updated
od2961/Qwen2.5-1.5B-OpenR1-GRAIL-WAGE
2B • Updated • 2
od2961/Qwen2.5-1.5B-OpenR1-GRAIL-GUN
2B • Updated • 1
datasets 17
od2961/illusion-of-reasoning-main-traces
Viewer • Updated • 1.24M • 7 • 1
od2961/deepseek-r1-math500-t0
Viewer • Updated • 500 • 7
od2961/gpt4o-math500-t005
Viewer • Updated • 500 • 9
od2961/deepseek-r1-math500-t005
Viewer • Updated • 500 • 8
od2961/gpt4o-math500-t0
Viewer • Updated • 500 • 11
od2961/grail-wage
Viewer • Updated • 18.9k • 4
od2961/grail-gun
Viewer • Updated • 4.94k • 9
od2961/grail-codeocean-raw
Preview • Updated • 30
od2961/rush4-5-6-balanced
Viewer • Updated • 300k • 5
od2961/rush4-5-balanced
Viewer • Updated • 44.7k • 3