DFlash (Collection): Block Diffusion for Flash Speculative Decoding • 21 items • Updated 15 days ago • 120
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora • Paper • 2604.24819 • Published 28 days ago • 89
Post: Interested in RL training environments? We just released a beginner-friendly walkthrough notebook! Train a model to play Wordle using TRL + OpenEnv (TextArena) + GRPO + vLLM. Happy learning!
Notebook: https://github.com/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb
OpenEnv guide in TRL: https://huggingface.co/docs/trl/main/en/openenv