Rethinking Cross-Layer Information Routing in Diffusion Transformers • Paper • 2605.20708 • Published 6 days ago • 97
LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws • Paper • 2605.23901 • Published 4 days ago • 9
Rethinking Muon Beyond Pretraining: Spectral Failures and High-Pass Remedies for VLA and RLVR • Paper • 2605.19282 • Published 7 days ago • 6
LatentUMM: Dual Latent Alignment for Unified Multimodal Models • Paper • 2605.17766 • Published 8 days ago • 6