HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 3 days ago • 25
Article NEO-unify: Building Native Multimodal Unified Models End to End about 1 month ago • 107
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels Paper • 2603.19312 • Published 22 days ago • 21
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs Paper • 2603.19217 • Published 16 days ago • 28
When Does Sparsity Mitigate the Curse of Depth in LLMs Paper • 2603.15389 • Published 20 days ago • 5
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions Paper • 2603.15612 • Published 19 days ago • 152
Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation Paper • 2603.16669 • Published 19 days ago • 70
Flash-KMeans: Fast and Memory-Efficient Exact K-Means Paper • 2603.09229 • Published 26 days ago • 82
DICE Collection A series of diffusion language models tailored for CUDA kernel generation. • 4 items • Updated Feb 13 • 3
The Trinity of Consistency as a Defining Principle for General World Models Paper • 2602.23152 • Published Feb 26 • 201
NEO1_0 Collection From Pixels to Words -- Towards Native Vision-Language Primitives at Scale • 7 items • Updated Jan 27 • 9
Waypoint-1 Collection The first real time diffusion world model designed for consumer hardware • 3 items • Updated Jan 30 • 8
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published Dec 1, 2025 • 94
Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems Paper • 2512.24385 • Published Dec 30, 2025 • 8
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation Paper • 2512.23705 • Published Dec 29, 2025 • 45
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding Paper • 2512.23646 • Published Dec 29, 2025 • 15
Autoregressive Image Generation with Randomized Parallel Decoding Paper • 2503.10568 • Published Mar 13, 2025 • 9