OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification Paper • 2606.01476 • Published 5 days ago • 8
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 4 days ago • 169
PANDO: Efficient Multimodal AI Agents via Online Skill Distillation Paper • 2605.24785 • Published 10 days ago • 11
Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration Paper • 2605.17423 • Published 19 days ago • 33
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs Paper • 2505.11277 • Published May 16, 2025 • 29
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 9 days ago • 419
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 16 days ago • 204
Training-Free Dense Hand Contact Estimation with Multi-Modal Large Language Models Paper • 2605.05886 • Published 29 days ago • 3
PageGuide: Browser extension to assist users in navigating a webpage and locating information Paper • 2604.23772 • Published Apr 26 • 7
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering Paper • 2604.08224 • Published Apr 9 • 52
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence? Paper • 2604.03016 • Published Apr 3 • 37