OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning Paper • 2606.26790 • Published 4 days ago • 45
Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models Paper • 2606.25041 • Published 6 days ago • 93
FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention Paper • 2606.09079 • Published 21 days ago • 64
When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents Paper • 2606.05806 • Published 25 days ago • 23
view article Article How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent nvidia • 24 days ago • 66
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published May 14 • 91
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published May 13 • 105
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published May 11 • 79
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL Paper • 2604.28123 • Published May 1 • 49
OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models Paper • 2605.00877 • Published Apr 25 • 15
Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction Paper • 2604.27221 • Published Apr 29 • 40
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing Paper • 2604.22782 • Published Apr 3 • 8
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows Paper • 2604.28139 • Published Apr 30 • 42