SafeDiffusion-R1: Online Reward Steering for Safe Diffusion Post-Training • Paper • 2605.18719 • Published May 18
Risk Under Pressure: Compute-Aware Evaluation of Adversarial Robustness in Language Models • Paper • 2606.11409 • Published 23 days ago