Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning Paper • 2605.06326 • Published 9 days ago • 24
CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics Paper • 2508.18124 • Published Aug 25, 2025 • 49
HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark? Paper • 2509.07894 • Published Sep 9, 2025 • 32
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads Paper • 2602.09443 • Published Feb 10 • 59