Yaorui SHI

yrshi

syr-cn

AI & ML interests

None yet

Recent Activity

upvoted a paper about 13 hours ago

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

Paper • 2605.08354 • Published 5 days ago • 20

upvoted a paper about 15 hours ago

Rubric-based On-policy Distillation

Paper • 2605.07396 • Published 5 days ago • 35

upvoted a paper 5 days ago

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Paper • 2605.06130 • Published 6 days ago • 92

upvoted a paper 6 days ago

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

Paper • 2605.03849 • Published 8 days ago • 122

upvoted 2 papers 12 days ago

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

Paper • 2604.22748 • Published 19 days ago • 226

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published 15 days ago • 264

upvoted 5 papers about 1 month ago

upvoted 5 papers about 2 months ago

SkillOrchestra: Learning to Route Agents via Skill Transfer

Paper • 2602.19672 • Published Feb 23 • 58

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 77

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published Mar 17 • 58

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published Mar 17 • 59

Attention Residuals

Paper • 2603.15031 • Published Mar 16 • 184

upvoted a paper 2 months ago

Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Paper • 2603.05438 • Published Mar 5 • 40

liked a dataset 2 months ago

OldKingMeister/lmsys-arena-processed-data

Preview • Updated Mar 7 • 115 • 1

upvoted 2 papers 2 months ago

Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution

Paper • 2512.10696 • Published Dec 11, 2025 • 3

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Paper • 2505.24298 • Published May 30, 2025 • 34
