Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 2 days ago • 37
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation Paper • 2606.02470 • Published 4 days ago • 16