Sudong Wang PRO
xiao45791
AI & ML interests
None yet
Recent Activity
commented on a paper about 19 hours ago
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
authored a paper about 23 hours ago
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
updated a dataset 2 days ago
prism-vlm/rl_dataset