Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published 1 day ago • 63
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 27 days ago • 52
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published Dec 18, 2025 • 75
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 74
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published Nov 29, 2025 • 26