LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing Paper • 2606.26740 • Published 8 days ago • 78
Agentic Abstention: Do Agents Know When to Stop Instead of Act? Paper • 2606.28733 • Published 6 days ago • 137
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection Paper • 2303.05499 • Published Mar 9, 2023 • 8
Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models Paper • 2606.25041 • Published 10 days ago • 111
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 10 days ago • 144
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models Paper • 2606.19534 • Published 16 days ago • 64
Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance Paper • 2606.19195 • Published 16 days ago • 139
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 17 days ago • 209
JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence Paper • 2606.14777 • Published 23 days ago • 208
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models Paper • 2606.16140 • Published 18 days ago • 121
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 22 days ago • 113
Redesign Mixture-of-Experts Routers with Manifold Power Iteration Paper • 2606.12397 • Published 23 days ago • 89
Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution Paper • 2606.06492 • Published 29 days ago • 95