MedMCP-Calc: Benchmarking LLMs for Realistic Medical Calculator Scenarios via MCP Integration Paper • 2601.23049 • Published 12 days ago • 1
DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models Paper • 2505.14107 • Published May 20, 2025 • 1
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling Paper • 2410.13610 • Published Oct 17, 2024 • 1
CP-Env: Evaluating Large Language Models on Clinical Pathways in a Controllable Hospital Environment Paper • 2512.10206 • Published Dec 11, 2025 • 1
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 2 days ago • 142
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently Paper • 2602.02619 • Published 9 days ago • 49
SpeContext: Enabling Efficient Long-context Reasoning with Speculative Context Sparsity in LLMs Paper • 2512.00722 • Published Nov 30, 2025 • 16