PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch Paper • 2510.06670 • Published Oct 8, 2025 • 1
Aligning Large Language Models via Fully Self-Synthetic Data Paper • 2510.06652 • Published Oct 8, 2025 • 1
GRLO: Towards Generalizable Reinforcement Learning in Open-Ended Environments from Zero Paper • 2605.15464 • Published 14 days ago
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs Paper • 2603.05890 • Published Mar 6 • 93