Yun Qu
yunqu
AI & ML interests
None yet
Recent Activity
authored a paper about 21 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
submitted a paper 1 day ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
Organizations
None yet