arxiv:2602.18422

Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control

Published on Feb 20 · Submitted by taesiri on Feb 23 · #3 Paper of the day

Abstract

A human-centric video world model conditioned on tracked head and hand poses is introduced, enabling dexterous interactions through a bidirectional video diffusion model trained for egocentric virtual environment generation.

AI-generated summary

Extended reality (XR) demands generative models that respond to users' tracked real-world motion, yet current video world models accept only coarse control signals such as text or keyboard input, limiting their utility for embodied interaction. We introduce a human-centric video world model that is conditioned on both tracked head pose and joint-level hand poses. For this purpose, we evaluate existing diffusion transformer conditioning strategies and propose an effective mechanism for 3D head and hand control, enabling dexterous hand-object interactions. We train a bidirectional video diffusion model teacher using this strategy and distill it into a causal, interactive system that generates egocentric virtual environments. We evaluate this generated reality system with human subjects and demonstrate improved task performance as well as a significantly higher perceived level of control over the performed actions compared with relevant baselines.
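The abstract contrasts a bidirectional teacher with a causal, interactive student but does not say how either is implemented. As a generic illustration only, and not the authors' method, the sketch below shows the usual difference at the level of a frame attention mask: a bidirectional model lets every frame attend to every other frame, while a causal model lets a frame attend only to itself and earlier frames, which is what makes frame-by-frame interactive generation possible. The function name `frame_attention_mask` and the frame count are hypothetical.

```python
def frame_attention_mask(num_frames: int, causal: bool) -> list[list[bool]]:
    """Return a mask where mask[i][j] is True if frame i may attend to frame j.

    Hypothetical helper for illustration; not taken from the paper.
    """
    return [
        # Causal: only the current and earlier frames are visible.
        # Bidirectional: every frame is visible.
        [(j <= i) if causal else True for j in range(num_frames)]
        for i in range(num_frames)
    ]


# A bidirectional (teacher-style) mask and a causal (student-style) mask
# over an assumed clip of 4 frames.
bidirectional = frame_attention_mask(4, causal=False)
causal = frame_attention_mask(4, causal=True)

for row in causal:
    print("".join("x" if allowed else "." for allowed in row))
```

In the causal mask, each row only depends on frames that have already been generated, so a new frame can be produced as soon as the latest tracked pose arrives.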

Community

Paper submitter

We introduce a human-centric video world model conditioned on head and hand poses, enabling interactive egocentric environments through bidirectional diffusion training and improved user control.

i am excited 😊 waiting for code release, this is most amazing things happen for vr, thank you for making it open source

