CREATE: Testing LLMs for Associative Creativity • Paper 2603.09970 • Published 12 days ago • 14
Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents • Paper 2602.16699 • Published Feb 18 • 15
OpenThoughts: Data Recipes for Reasoning Models • Paper 2506.04178 • Published Jun 4, 2025 • 53
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models • Paper 2505.13444 • Published May 19, 2025 • 16