hamishivi/swerl_qwen35_9b_base_tmax_10k_grpo_mask_no_submit_10pct_step160 9B • Updated about 2 hours ago
hamishivi/swerl_qwen35_9b_base_tmax_10k_grpo_mask_no_submit__42__1777143486_step_200 9B • Updated 2 days ago • 155
hamishivi/swerl_qwen35_9b_base_tmax_10k_grpo_mask_overlong__42__1777163763_step_200 9B • Updated 2 days ago • 155
hamishivi/vip_grpo_base_p32_2403_qwen3_4b_math__1__1774385112_step1000 196k • Updated 5 days ago • 297
hamishivi/swerl_qwen35_9b_base_agent_task_combined_grpo__42__1776554202_step100 9B • Updated 8 days ago • 306