Rohan Surana

rohan2810

AI & ML interests

None yet

Recent Activity

upvoted a paper about 22 hours ago

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization

submitted a paper 1 day ago

F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking

upvoted a paper 9 days ago

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Organizations

None yet

upvoted a paper about 22 hours ago

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization

Paper • 2605.10784 • Published 5 days ago • 1

submitted a paper to Daily Papers 1 day ago

F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking

Paper • 2605.12995 • Published 3 days ago • 1

upvoted a paper 9 days ago

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Paper • 2605.02913 • Published Apr 8 • 9

submitted a paper to Daily Papers 9 days ago

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Paper • 2605.02913 • Published Apr 8 • 9

New activity in noraizz1323/Qwen3-4B-Instruct-2507-SFT-eli5-4k 10 days ago

Upload folder using huggingface_hub

#1 opened 10 days ago by

rohan2810

updated a model about 2 months ago

rohan2810/movielens_heissen_theta_normalized_massdpo_theta_normalized_llama-3.2-3b-instruct_0.1_3_lastlaye

Updated Mar 28

published a model about 2 months ago

rohan2810/movielens_heissen_theta_normalized_massdpo_theta_normalized_llama-3.2-3b-instruct_0.1_3_lastlaye

Updated Mar 28

updated a model about 2 months ago

rohan2810/debug-lastlayer-theta3-rerun-20260328-001108

Updated Mar 28

published a model about 2 months ago

rohan2810/debug-lastlayer-theta3-rerun-20260328-001108

Updated Mar 28

updated a model about 2 months ago

rohan2810/debug-lastlayer-theta3-rerun-20260327-204700

Updated Mar 28

published a model about 2 months ago

rohan2810/debug-lastlayer-theta3-rerun-20260327-204700

Updated Mar 28

updated a model about 2 months ago

rohan2810/debug-lastlayer-theta3-foreground-20260327-140219

Updated Mar 27

published a model about 2 months ago

rohan2810/debug-lastlayer-theta3-foreground-20260327-140219

Updated Mar 27

updated 2 models about 2 months ago

rohan2810/movielens-llama-massdpo-theta-neg3-20260325-145401

Updated Mar 26

rohan2810/movielens-llama-3.2-3b-instruct-massdpo-theta-neg10-20260326-053614

Updated Mar 26

published a model about 2 months ago

rohan2810/movielens-llama-3.2-3b-instruct-massdpo-theta-neg10-20260326-053614

Updated Mar 26

updated a model about 2 months ago

rohan2810/movielens-llama-3.2-3b-instruct-massdpo-theta-neg10-20260326-021023

Updated Mar 26

published a model about 2 months ago

rohan2810/movielens-llama-3.2-3b-instruct-massdpo-theta-neg10-20260326-021023

Updated Mar 26

updated a model about 2 months ago

rohan2810/movielens-llama-3.2-3b-instruct-sdpo-all-neg19-20260326-021023

Updated Mar 26

published a model about 2 months ago

rohan2810/movielens-llama-3.2-3b-instruct-sdpo-all-neg19-20260326-021023

Updated Mar 26
