WALAR - a lyf07 Collection

lyf07's Collections

WALAR

updated 3 days ago

Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation

lyf07/LLaMAX3-8B-Alpaca-WALAR

8B • Updated 3 days ago • 42

lyf07/Qwen3-8B-WALAR

8B • Updated 3 days ago • 53

lyf07/Translategemma-4B-it-WALAR

769k • Updated 3 days ago • 43

Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation

Paper • 2603.13045 • Published 7 days ago • 1