Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms
Abstract
Video generation models capable of simulating complex physical dynamics and long-horizon causality require efficient frameworks to become practical world simulators for interactive applications.
The rapid evolution of video generation has enabled models to simulate complex physical dynamics and long-horizon causality, positioning them as potential world simulators. However, a critical gap remains between the theoretical capacity for world simulation and the heavy computational cost of spatiotemporal modeling. To address this, we systematically review video generation frameworks and techniques that treat efficiency as a crucial requirement for practical world modeling. We introduce a novel taxonomy along three dimensions: efficient modeling paradigms, efficient network architectures, and efficient inference algorithms. We further show that bridging this efficiency gap directly empowers interactive applications such as autonomous driving, embodied AI, and game simulation. Finally, we identify emerging research frontiers in efficient video-based world modeling, arguing that efficiency is a fundamental prerequisite for evolving video generators into general-purpose, real-time, and robust world simulators.