How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models Paper • 2604.21106 • Published Apr 27
Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models Paper • 2605.07721 • Published 24 days ago