XYX

xuyd16

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 12 days ago

TREK: Distill to Explore, Reinforce to Refine

Paper • 2607.05339 • Published 14 days ago • 8

submitted a paper to Daily Papers 12 days ago

TREK: Distill to Explore, Reinforce to Refine

Paper • 2607.05339 • Published 14 days ago • 8

upvoted a paper 19 days ago

TRIAGE: Role-Typed Credit Assignment for Agentic Reinforcement Learning

Paper • 2606.32017 • Published 20 days ago • 12

authored a paper 2 months ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published May 12 • 10

upvoted a paper 2 months ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published May 12 • 10

submitted a paper to Daily Papers 2 months ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published May 12 • 10

upvoted a paper 3 months ago

TIP: Token Importance in On-Policy Distillation

Paper • 2604.14084 • Published Apr 15 • 15

liked a model 3 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 28 days ago • 1.49M • 5.26k

upvoted 4 papers 3 months ago

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published Apr 13 • 69

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Paper • 2604.14144 • Published Apr 15 • 63

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 103

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 168

submitted a paper to Daily Papers 3 months ago

TIP: Token Importance in On-Policy Distillation

Paper • 2604.14084 • Published Apr 15 • 15

submitted a paper to Daily Papers 4 months ago

PACED: Distillation at the Frontier of Student Competence

Paper • 2603.11178 • Published Mar 11 • 4

authored 4 papers 4 months ago

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning

Paper • 2602.21420 • Published Feb 24 • 6

upvoted 2 papers 4 months ago

PACED: Distillation at the Frontier of Student Competence

Paper • 2603.11178 • Published Mar 11 • 4

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning

Paper • 2602.21420 • Published Feb 24 • 6
