Commit History

SFT eval on 22-task held-out split - fill in leaderboard
2e1dd84

bpHigh Claude Opus 4.7 (1M context) committed on

Move SUPPORTS_CONCURRENT_SESSIONS from module-level to class attribute
90a25f6

bpHigh Claude Opus 4.7 (1M context) committed on

Enable concurrent sessions in env for GRPO training
99c16d0

bpHigh Claude Opus 4.7 (1M context) committed on

Attribute hand-curated Round-1 tasks to Finch + list ALL 119 tasks in openenv.yaml
8d80d79

bpHigh Claude Opus 4.7 (1M context) committed on

Phase 11.7: interactive Prev/Next/Play replay (was static wall of HTML)
d2310e1

bpHigh Claude Opus 4.7 (1M context) committed on

Add raw_logs.txt + HF Job + adapter links to dashboard and README
f2e02e4

bpHigh Claude Opus 4.7 (1M context) committed on

Phase 11.6: Kimi-K2.5 best-run replays in dashboard
3e65e46

bpHigh Claude Opus 4.7 (1M context) committed on

Fix blank Gradio iframe - set root_path='/dashboard' on mount
05b7358

bpHigh Claude Opus 4.7 (1M context) committed on

Phase 11.5: Gradio dashboard at /dashboard (now the Space's base_path)
ae0420a

bpHigh Claude Opus 4.7 (1M context) committed on

Phase 9: hard early-submit gate at env layer (kills the exploit class)
9033aad

bpHigh Claude Opus 4.7 (1M context) committed on

Add extended arena stuff
a57d682

bpHigh committed on

Graduated code step rewards based on execution success and code substance
b448320

bpHigh committed on

Financial Task Environment - code execution with real xlsx
cd4b800

bpHigh committed on