Redesign reward for discrimination: efficiency multiplier, strict penalties, stretch bonus, start at level 1 46f0850 Aswini-Kumar committed on Apr 26
refactor: extract agent_utils.py (shared prompt/commands/server utils), simplify reward to env+format, add audit.py 51a79ee Aswini-Kumar committed on Apr 26
Data-Centric AI RL Environment — OpenEnv Hackathon Submission 71dc210 Aswini-Kumar committed on Apr 25