VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model Paper • 2407.06491 • Published Jul 9, 2024
Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method Paper • 2501.00584 • Published Dec 31, 2024
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published Dec 18, 2025
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization Paper • 2602.02341 • Published 5 days ago
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay Paper • 2412.04449 • Published Dec 5, 2024