Worldscape-MoE Model Weights
This repository contains the model weights introduced in the paper "Worldscape-MoE: A Unified Mixture-of-Experts World Model for Scalable Heterogeneous Action Control".
Worldscape-MoE is a Mixture-of-Experts world model for scalable heterogeneous action control. It unifies multiple control interfaces, including camera trajectories, robot actions, and hand-joint signals, within a shared world-dynamics backbone instead of treating each control modality as an isolated modeling problem.
Built on Diffusion Transformers, Worldscape-MoE combines modality-aware control injection, shared and control-specific experts, and a progressive MoE tuning strategy to absorb heterogeneous action supervision while preserving a shared model of world dynamics. Across locomotion, robotic manipulation, and egocentric hand control, heterogeneous supervision can improve individual control capabilities rather than interfere with them, and the model supports out-of-distribution generalization and continual extension to new action modalities.
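To make the idea of shared and control-specific experts concrete, here is a minimal NumPy sketch. It is not the released implementation. It assumes hard routing by a modality label, two-layer MLP experts, and a residual sum of the shared and modality-specific outputs; the actual routing, expert design, and control injection in Worldscape-MoE may differ. The class name `SharedPlusSpecificMoE`, the modality labels, and all sizes are hypothetical.

```python
import numpy as np

# Hypothetical labels for the three control interfaces named above.
MODALITIES = ("camera", "robot", "hand")


def make_expert(rng, dim, hidden):
    """Create weights for a small two-layer MLP expert."""
    return {
        "w1": rng.standard_normal((dim, hidden)) * 0.02,
        "w2": rng.standard_normal((hidden, dim)) * 0.02,
    }


def run_expert(expert, x):
    """Apply the expert MLP with a ReLU nonlinearity."""
    return np.maximum(x @ expert["w1"], 0.0) @ expert["w2"]


class SharedPlusSpecificMoE:
    """Toy layer: one always-on shared expert plus one expert per modality.

    Assumption: routing is a hard lookup on the modality label,
    not a learned gate.
    """

    def __init__(self, dim=8, hidden=16, seed=0):
        self.dim, self.hidden = dim, hidden
        rng = np.random.default_rng(seed)
        self.shared = make_expert(rng, dim, hidden)
        self.specific = {m: make_expert(rng, dim, hidden) for m in MODALITIES}

    def __call__(self, tokens, modality):
        if modality not in self.specific:
            raise KeyError(f"unknown modality: {modality}")
        shared_out = run_expert(self.shared, tokens)
        specific_out = run_expert(self.specific[modality], tokens)
        # Residual connection plus both expert contributions.
        return tokens + shared_out + specific_out

    def add_modality(self, name, seed=1):
        """Register a new control-specific expert; existing experts are untouched."""
        rng = np.random.default_rng(seed)
        self.specific[name] = make_expert(rng, self.dim, self.hidden)


layer = SharedPlusSpecificMoE()
tokens = np.ones((4, 8))              # 4 tokens with 8 features each
camera_out = layer(tokens, "camera")  # shared expert + camera expert
layer.add_modality("gripper")         # hypothetical new modality
gripper_out = layer(tokens, "gripper")
```

In this sketch, adding a modality only creates a new entry in `specific`, so outputs for existing modalities are unchanged. That is one simple way to picture continual extension under the assumptions above.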
For more details about the project, please refer to the project page: https://worldscape-moe.com
Demo video: https://youtu.be/c19iBwpdiv4
Citation
If this work has contributed to your research, please consider citing:
@misc{fang2026worldscapemoe,
title = {Worldscape-MoE: A Unified Mixture-of-Experts World Model for Scalable Heterogeneous Action Control},
author = {Jianjie Fang and Yongyan Xu and Ziyou Wang and Chen Gao and Yuchao Huang and Zhaolu Wang and Rongze Tang and Mingyuan Jia and Baining Zhao and Weichen Zhang and Xin Zhang and Haisheng Su and Yu Shang and Wei Wu and Xinlei Chen and Yong Li},
year = {2026},
note = {Project page: https://worldscape-moe.com}
}