arvindcr4/tinker-rl-frontier_gsm8k_nemotron-120b-nemotron-120b Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-frontier_gsm8k_deepseek-v3.1-deepseek-v3.1 Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-w1_qwen3-8b-base-qwen3-8b-base-s42-run1 Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-w1_qwen3-8b-base-qwen3-8b-base-s42-run2 Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-w1_llama31-8b-base-llama-3.1-8b-s42 Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-scale_gsm8k_llama-8b-inst-llama-8b-inst Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-distillation_off_trajectory-qwen3-8b-base Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-cross_tool_llama-8b-inst-llama-8b-inst Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-arithmetic_trajectory-llama-3.2-1b Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-arch_gsm8k_gpt-oss-20b-gpt-oss-20b Reinforcement Learning • Updated 11 days ago
arvindcr4/tinker-rl-bench-frontier_gsm8k_nemotron-120b Reinforcement Learning • Updated 12 days ago