Attention Sinks Are Provably Necessary in Softmax Transformers: Evidence from Trigger-Conditional Tasks Paper • 2603.11487 • Published 2 days ago
Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data Paper • 2601.15158 • Published Jan 21