rasdani/deepseek_r1_qwen14b_swe_rl_8k
15B • Updated
• 1
• 1
rasdani/deepseek_r1_llama_8b_swe_rl_8k_12_epochs
8B • Updated
• 1
rasdani/qwen3_8b_swe_rl_8k
rasdani/deepseek_r1_7b_gh_patches_2k_fixed_reward
8B • Updated
• 1
rasdani/deepseek_r1_7b_gh_patches_2k
8B • Updated
rasdani/crux-eval_math-eval-logs
Updated
rasdani/git-diff-Qwen-4B-10k
4B • Updated
rasdani/git-diff-Qwen-4B-10k-checkpoints
Updated
rasdani/git-diff-Qwen-4B-32k-checkpoints
Updated
rasdani/git-diff-Qwen-4B-30k
4B • Updated
4B • Updated
• 5
rasdani/git-diff-Qwen-1.7B
2B • Updated
rasdani/git-diff-Qwen-1.7-B
2B • Updated
rasdani/simple-math-Qwen-1.5B
2B • Updated
rasdani/qwen3_0_6b_function_rm
0.8B • Updated
• 1
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-8192k
0.5B • Updated
rasdani/Qwen2.5-0.5B-simpleRL-Zoo
Text Generation • 0.5B • Updated
• 1
rasdani/smolR1-Qwen2.5-0.5B
Text Generation • 0.5B • Updated
• 1
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-no-KL
Updated
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-3072k
Updated
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-4096k
Updated
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-2560k
Updated
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-2048k
Updated
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-first-try
0.5B • Updated
rasdani/Qwen-1.5B-Distill-GRPO
Text Generation • 2B • Updated
rasdani/Qwen-0.5B-Instruct-GRPO
Updated
rasdani/gsm8k_qwen2.5-0.5b
0.5B • Updated
• 1
rasdani/Qwen2.5-1.5B-Open-R1-Code-GRPO
Updated
rasdani/Qwen2.5-0.5B-Open-R1-Code-GRPO
Text Generation • 0.6B • Updated
rasdani/Qwen2.5-7B-Instruct-GRPO-unsloth
Text Generation • 8B • Updated
• 1