arxiv:2605.30855

Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation

Published on May 29

Abstract

Robust Dreamer addresses challenges in 3D-aware video generation through latent Gaussian memory and deviation learning to maintain visual fidelity and 3D consistency over long sequences.

AI-generated summary

Frame-wise action-controlled image-to-video generation is a promising paradigm for interactive world simulation, where each control signal should elicit an immediate visual response. However, maintaining visual fidelity and 3D consistency over long autoregressive rollouts remains challenging. Existing 3D-aware methods often suffer from catastrophic drift due to two impediments: information loss from Latent-RGB Cycling, where generated latents are repeatedly decoded to RGB and re-encoded for future conditioning, and the training-inference gap induced by the error-free hypothesis, where clean training memory fails to match prediction-corrupted inference memory. To address these challenges, we present Robust Dreamer, a memory-augmented framework built around how to design 3D memory and how to use it robustly. First, we introduce Latent Gaussian Memory, which anchors diffusion latents inherited from the generation process to Gaussian primitives and recalls them via latent-space Gaussian splatting. This provides dense, geometry-aware, view-aligned conditioning while avoiding accumulated degradation from repeated VAE conversion. Second, we propose Deviation Learning with Dynamic Deviation Archive, which synthesizes rollout-induced latent deviations through a one-step approximation, stores them by autoregressive stage and denoising timestamp, and injects them into historical memory during training. This exposes the generator to realistic corrupted memory states and teaches internal correction before inference. Experiments on ScanNet, DL3DV, and OmniWorldGame demonstrate state-of-the-art long-horizon performance.
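The archive idea in the abstract (store deviations by autoregressive stage and denoising timestamp, then inject them into memory during training) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `DeviationArchive`, the function `corrupt_memory`, the bucket capacity, and the additive injection are all hypothetical choices made for illustration, and the one-step approximation that produces a deviation is assumed to have already happened (here a deviation is simply a predicted latent minus a clean latent).

```python
import random
from collections import defaultdict

import numpy as np


class DeviationArchive:
    """Toy archive of latent deviations keyed by (AR stage, denoising step).

    Hypothetical structure for illustration; the abstract does not specify
    how the archive is laid out or how entries are evicted.
    """

    def __init__(self, capacity_per_bucket=8, seed=0):
        self.capacity = capacity_per_bucket
        self.buckets = defaultdict(list)
        self.rng = random.Random(seed)

    def store(self, stage, timestep, deviation):
        """Add a deviation to its bucket, dropping the oldest entry when full."""
        bucket = self.buckets[(stage, timestep)]
        bucket.append(np.asarray(deviation, dtype=np.float32))
        if len(bucket) > self.capacity:
            bucket.pop(0)

    def sample(self, stage, timestep):
        """Return a random stored deviation, or None if the bucket is empty."""
        bucket = self.buckets.get((stage, timestep))
        if not bucket:
            return None
        return self.rng.choice(bucket)


def corrupt_memory(clean_memory, archive, stage, timestep):
    """Inject a stored deviation into clean memory (assumed additive)."""
    deviation = archive.sample(stage, timestep)
    if deviation is None:
        return clean_memory  # nothing archived yet: train on clean memory
    return clean_memory + deviation


# Usage: archive one deviation, then corrupt a clean latent with it.
archive = DeviationArchive()
clean = np.zeros((4, 8, 8), dtype=np.float32)   # stand-in latent
predicted = clean + 0.1                          # stand-in model prediction
archive.store(stage=2, timestep=500, deviation=predicted - clean)
corrupted = corrupt_memory(clean, archive, stage=2, timestep=500)
untouched = corrupt_memory(clean, archive, stage=0, timestep=0)
```

Keying by both stage and timestamp reflects the abstract's description; the intuition, under the assumptions above, is that deviations later in a rollout or at different noise levels have different statistics, so training memory is corrupted with a deviation matching the state it would be in at inference.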


