From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning • Paper • 2606.07190 • Published 6 days ago • 21
Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application • Paper • 2606.12191 • Published 1 day ago • 53