Add reward tuning, improved prompt, eval harness, and serving Dockerfile b32b61a Nitishkumar-ai Claude Opus 4.6 committed about 5 hours ago
Add smoke test for random episodes and initial simulated rewards data 1f65720 Nitishkumar-ai committed about 7 hours ago