od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-2k
2B • Updated • 1
od2961/Qwen2.5-7B-Open-R1-MaxEnt-GRPO-math-2k
333k • Updated
od2961/Qwen2.5-7B-Open-R1-GRPO-math-2k
333k • Updated • 1
od2961/Qwen2.5-1.5B-Open-R1-MaxEnt-GRPO-math-v1
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-MaxEnt-GRPO-BASELINE-math-v1
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-v1
Text Generation • 2B • Updated • 290
od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-v1-grpoonly
Updated
od2961/Qwen2.5-1.5B-OpenR1-GRPO-GUN
2B • Updated
od2961/Qwen2.5-1.5B-OpenR1-GRAIL-WAGE
2B • Updated • 2
od2961/Qwen2.5-1.5B-OpenR1-GRAIL-GUN
2B • Updated • 1
od2961/Qwen2.5-1.5B-OpenR1-GRPO
2B • Updated • 1
od2961/Qwen2.5-1.5B-OpenR1-GRAIL
Text Generation • 2B • Updated • 6
od2961/Llama-8B-Open-R1-GRPO-math-v2
8B • Updated • 1
od2961/Llama-8B-Open-R1-GRPO-math-v1
Updated
od2961/Qwen2.5-7B-Open-R1-GRPO-math-7b
Text Generation • 8B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v03
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v04
Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v02
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v01
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-carpark-v1
Updated
od2961/Qwen2.5-1.5B-OpenR1-no-GRAIL
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-math-v2
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v1
2B • Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v11
Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v10
Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v9
Updated
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v7
2B • Updated • 1
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v8
2B • Updated • 1
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v6
od2961/Qwen2.5-1.5B-Open-R1-GRPO-Crosswords-v5
2B • Updated • 9