Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 11 days ago • 8
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 4 days ago • 38
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published 6 days ago • 29
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 6 days ago • 86