Xuekang Wang
wxk123
AI & ML interests
LLM Safety
Recent Activity
authored a paper about 22 hours ago
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning
updated a model 6 months ago
wxk123/llama-3.2-1b-instruct-augmented