Article • Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 164
Running • The ultimate guide to RL environments: building and scaling them in the LLM era 📝 • 191 • Building and scaling RL environments for LLM training
posttrain_model_ckpts • Collection • LoRA checkpoints for post-training experiments on LLaMA-2-7B with various data selection methods (MMLU task) • 8 items • Updated Mar 18