Zhendong Chu
Wesley123
AI & ML interests
Natural Language Processing, Recommender Systems
Recent Activity
upvoted a paper 1 day ago
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories
upvoted a paper 8 months ago
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
upvoted a paper 12 months ago
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
Organizations
None yet