Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment Paper • 2604.19548 • Published 16 days ago • 16
Reasoning Implicit Sentiment with Chain-of-Thought Prompting Paper • 2305.11255 • Published May 18, 2023 • 2
CMNER: A Chinese Multimodal NER Dataset based on Social Media Paper • 2402.13693 • Published Feb 21, 2024
PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis Paper • 2408.09481 • Published Aug 18, 2024 • 1
LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model Paper • 2304.06248 • Published Apr 13, 2023
NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations Paper • 2501.17261 • Published Aug 22, 2024
On Path to Multimodal Generalist: General-Level and General-Bench Paper • 2505.04620 • Published May 7, 2025 • 83
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist Paper • 2511.08521 • Published Nov 11, 2025 • 39
FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents Paper • 2506.01520 • Published Jun 2, 2025
Unveiling the Cognitive Compass: Theory-of-Mind-Guided Multimodal Emotion Reasoning Paper • 2602.00971 • Published Feb 28 • 1
Zero-Shot Conversational Stance Detection: Dataset and Approaches Paper • 2506.17693 • Published Jun 21, 2025
VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification Paper • 2604.01569 • Published Apr 2 • 13
OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning Paper • 2603.24458 • Published Mar 25 • 9
Towards Semantic Equivalence of Tokenization in Multimodal LLM Paper • 2406.05127 • Published Jun 7, 2024
So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection Paper • 2505.18660 • Published May 24, 2025 • 2
Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models Paper • 2505.24164 • Published May 30, 2025
SMAP: Self-supervised Motion Adaptation for Physically Plausible Humanoid Whole-body Control Paper • 2505.19463 • Published May 26, 2025
MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation Paper • 2510.00647 • Published Oct 1, 2025