Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria Paper • 2605.08354 • Published 5 days ago • 20
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 6 days ago • 92
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Paper • 2605.03849 • Published 8 days ago • 122
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 19 days ago • 226
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published Apr 2 • 101
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published Mar 30 • 85
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published Mar 29 • 146
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 350
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published Mar 26 • 52
SkillOrchestra: Learning to Route Agents via Skill Transfer Paper • 2602.19672 • Published Feb 23 • 58
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published Mar 22 • 77
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published Mar 17 • 58
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model Paper • 2603.05438 • Published Mar 5 • 40
Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution Paper • 2512.10696 • Published Dec 11, 2025 • 3
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Paper • 2505.24298 • Published May 30, 2025 • 34