Rongman Xu
rowanserena
AI & ML interests
None yet
Recent Activity
upvoted a paper about 15 hours ago
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions
upvoted a paper about 15 hours ago
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents
upvoted a paper about 2 months ago
A^3-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation
Organizations
None yet