Chenglin
Chain123
AI & ML interests
None yet
Organizations
None yet
Multi-module
- VIDEOP2R: Video Understanding from Perception to Reasoning
  Paper • 2511.11113 • Published • 111
- MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs
  Paper • 2511.14159 • Published • 25
- REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding
  Paper • 2511.13026 • Published • 26
- OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
  Paper • 2511.14582 • Published • 19
Agent
Reasoning