Why Far Looks Up: Probing Spatial Representation in Vision-Language Models Paper • 2605.30161 • Published 6 days ago • 56
Personalize-then-Store: Benchmarking and Learning Personalized Memory for Long-horizon Agents Paper • 2605.25535 • Published 9 days ago • 41
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published Nov 7, 2024 • 29