From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space Paper • 2604.14142 • Published 1 day ago • 19
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 1 day ago • 87
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 2 days ago • 89
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper • 2604.08865 • Published 6 days ago • 26
From Word to World: Can Large Language Models be Implicit Text-based World Models? Paper • 2512.18832 • Published Dec 21, 2025 • 15
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper • 2604.08865 • Published 6 days ago • 26