LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
Abstract
LiveWorld addresses the out-of-sight dynamics problem in video world models by introducing a persistent global state representation that maintains continuous evolution of dynamic entities beyond the observer's field of view.
Recent generative video world models aim to simulate visual environment evolution, allowing an observer to interactively explore the scene via camera control. However, they implicitly assume that the world only evolves within the observer's field of view. Once an object leaves the observer's view, its state is "frozen" in memory, and revisiting the same region later often fails to reflect events that should have occurred in the meantime. In this work, we identify and formalize this overlooked limitation as the "out-of-sight dynamics" problem, which impedes video world models from representing a continuously evolving world. To address this issue, we propose LiveWorld, a novel framework that extends video world models to support persistent world evolution. Instead of treating the world as static observational memory, LiveWorld models a persistent global state composed of a static 3D background and dynamic entities that continue evolving even when unobserved. To maintain these unseen dynamics, LiveWorld introduces a monitor-based mechanism that autonomously simulates the temporal progression of active entities and synchronizes their evolved states upon revisiting, ensuring spatially coherent rendering. For evaluation, we further introduce LiveBench, a dedicated benchmark for the task of maintaining out-of-sight dynamics. Extensive experiments show that LiveWorld enables persistent event evolution and long-term scene consistency, bridging the gap between existing 2D observation-based memory and true 4D dynamic world simulation. The baseline and benchmark will be publicly available at https://zichengduan.github.io/LiveWorld/index.html.
Community
Introducing LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
Current video world models have a critical flaw: they freeze objects the moment they leave the camera's view, completely ignoring elapsed time. We formalize this as the Out-of-Sight Dynamics problem.
LiveWorld solves this by explicitly decoupling World Evolution from Observation Rendering:
- Virtual Monitors: We register "Monitors" that autonomously fast-forward the temporal progression of unobserved active entities in the background. When you look back, their states are up-to-date.
- Tractable Efficiency: We factorize the world into a static 3D background (accumulated via SLAM) and sparse dynamic entities, keeping computation highly manageable.
- LiveBench: We also introduce the first dedicated benchmark for evaluating long-horizon, out-of-sight dynamics.
With this design, LiveWorld narrows the gap between static observational memory and persistent 4D world simulation.
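The monitor idea described above can be illustrated with a minimal sketch. This is not the paper's released code; all names (`Monitor`, `evolve`, `sync`) are hypothetical, and the toy "burning candle" entity stands in for any dynamic entity whose state keeps evolving while unobserved.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of a "monitor": each unobserved dynamic entity
# carries a state-transition function, and the monitor fast-forwards
# the entity by the elapsed simulation time when the observer revisits.

@dataclass
class Monitor:
    state: dict                            # entity's dynamic state
    evolve: Callable[[dict, float], dict]  # state transition over dt
    last_synced: float = 0.0               # sim time of the last sync

    def sync(self, now: float) -> dict:
        """Fast-forward the entity's state to the current sim time."""
        dt = now - self.last_synced
        if dt > 0:
            self.state = self.evolve(self.state, dt)
            self.last_synced = now
        return self.state

# Toy entity: a candle burning down at 0.5 units of height per time unit
# while out of the observer's field of view.
burn = lambda s, dt: {"height": max(0.0, s["height"] - 0.5 * dt)}
candle = Monitor(state={"height": 10.0}, evolve=burn)

candle.sync(now=4.0)  # observer looks away for 4 time units, then revisits
print(candle.state)   # {'height': 8.0}
```

The key design point mirrored here is the decoupling of world evolution from observation rendering: states advance on sync rather than on every frame, so unobserved entities cost nothing until they are revisited.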
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories (2026)
- UCM: Unifying Camera Control and Memory with Time-aware Positional Encoding Warping for World Models (2026)
- Geometry-Aware Rotary Position Embedding for Consistent Video World Model (2026)
- Beyond Pixel Histories: World Models with Persistent 3D State (2026)
- Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory (2026)
- CamDirector: Towards Long-Term Coherent Video Trajectory Editing (2026)
- WorldStereo: Bridging Camera-Guided Video Generation and Scene Reconstruction via 3D Geometric Memories (2026)
