SFT eval on 22-task held-out split - fill in leaderboard 2e1dd84 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Move SUPPORTS_CONCURRENT_SESSIONS from module-level to class attribute 90a25f6 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Enable concurrent sessions in env for GRPO training 99c16d0 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Attribute hand-curated Round-1 tasks to Finch + list ALL 119 tasks in openenv.yaml 8d80d79 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Phase 11.7: interactive Prev/Next/Play replay (was static wall of HTML) d2310e1 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Add raw_logs.txt + HF Job + adapter links to dashboard and README f2e02e4 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Phase 11.6: Kimi-K2.5 best-run replays in dashboard 3e65e46 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Fix blank Gradio iframe - set root_path='/dashboard' on mount 05b7358 bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Phase 11.5: Gradio dashboard at /dashboard (now the Space's base_path) ae0420a bpHigh Claude Opus 4.7 (1M context) committed on Apr 26
Phase 9: hard early-submit gate at env layer (kills the exploit class) 9033aad bpHigh Claude Opus 4.7 (1M context) committed on Apr 25
Graduated code step rewards based on execution success and code substance b448320 bpHigh committed on Apr 8