Andreas Stöffelbauer
andreasskyscanner
AI & ML interests
None yet
Recent Activity
upvoted a paper about 4 hours ago
You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass
upvoted a paper about 12 hours ago
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation
upvoted a paper about 12 hours ago
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
Organizations
None yet