MedMCP-Calc: Benchmarking LLMs for Realistic Medical Calculator Scenarios via MCP Integration • Paper 2601.23049 • Published 12 days ago
CP-Env: Evaluating Large Language Models on Clinical Pathways in a Controllable Hospital Environment • Paper 2512.10206 • Published Dec 11, 2025
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling • Paper 2410.13610 • Published Oct 17, 2024
DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models • Paper 2505.14107 • Published May 20, 2025