Defining and Evaluating Visual Language Models' Basic Spatial Abilities: A Perspective from Psychometrics Paper • 2502.11859 • Published Feb 17, 2025
Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations Paper • 2506.04633 • Published Jun 5, 2025
PulseCheck457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models Paper • 2502.08636 • Published Feb 12, 2025
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision Paper • 2506.06253 • Published Jun 6, 2025
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges Paper • 2505.04769 • Published May 7, 2025
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published Jan 23, 2025
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30, 2025
AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance Paper • 2508.06944 • Published Aug 9, 2025
SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes Paper • 2605.31148 • Published 7 days ago