William Scott
trueza2s
AI & ML interests
None yet
Recent Activity
upvoted a paper about 5 hours ago
Video Models Can Reason with Verifiable Rewards
upvoted a paper about 22 hours ago
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
upvoted a paper about 23 hours ago
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
Organizations
None yet