| # SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding (CVPR 2026 Highlight) |
| This repository contains the official PyTorch implementation of SpatialScore: https://arxiv.org/abs/2505.17012/. |
|
|
| The latest version of our paper has been accepted to CVPR 2026, and we have released the updated code and data! |
| Feel free to reach out for discussions! |
|
|
| <div align="center"> |
| <img src="./assets/dataset.png"> |
| </div> |
|
|
| Current Leaderboard (You are welcome to test your models on SpatialScore!): |
|
|
| <div align="center"> |
| <img src="./assets/SpatialScore.png"> |
| </div> |
|
|
| ## Some Information |
| [Project Page](https://haoningwu3639.github.io/SpatialScore/) · [Paper](https://arxiv.org/abs/2505.17012/) · [SpatialScore_Benchmark](https://huggingface.co/datasets/haoningwu/SpatialScore) · [SpatialCorpus](https://huggingface.co/datasets/haoningwu/SpatialCorpus) · [Model](https://huggingface.co/haoningwu/SpatialScore) |
|
|
| ## News |
| - [2026.5] We have released our updated code and data! |
| - [2026.4] Glad to share that **SpatialScore** has been accepted to **CVPR 2026** and selected as a **Highlight**. |
| - [2025.5] ~~We have released version_0 of our evaluation code, supporting most mainstream models.~~ |
| - [2025.5] ~~We have released version_0 of SpatialScore, which is available on [Huggingface](https://huggingface.co/datasets/haoningwu/SpatialScore).~~ |
| - [2025.5] Our pre-print paper is released on arXiv. |
|
|
| ## Requirements |
| - Python >= 3.10 (we recommend using [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html)) |
| - [PyTorch >= 2.8.0](https://pytorch.org/) |
| - accelerate == 1.13.0 |
| - xformers == 0.0.32.post1 |
| - flash-attn == 2.8.2 |
| - vllm == 0.11.0 |
| - triton == 3.4.0 |
| - triton_kernels (please refer to [gpt_oss](https://wheels.vllm.ai/gpt-oss/triton-kernels/) for a version that supports gpt_oss) |
| - transformers == 4.57.3 |
|
|
| The aforementioned dependencies are necessary for conducting evaluations on SpatialScore. |
| If you intend to use SpatialAgent, note that it invokes various spatial perception tools. You may need to consult the following repositories to install the corresponding tool dependencies and download their pre-trained checkpoints: [Rex-Omni](https://github.com/IDEA-Research/Rex-Omni), [Map-Anything](https://github.com/facebookresearch/map-anything), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), and [DetAny3D](https://github.com/OpenDriveLab/DetAny3D). |
|
|
| A suitable [conda](https://conda.io/) environment named `SpatialScore` can be created and activated with: |
|
|
| ``` |
| conda env create -f environment.yaml |
| conda activate SpatialScore |
| ``` |
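
Once the environment is active, it can be useful to confirm that the installed packages match the exact pins listed under Requirements. The following is a minimal sketch using only the standard library; the `PINNED` table and the `check_pins` helper are illustrative names, not part of this repository, and the pins are copied from the list above.

```python
from importlib import metadata

# Exact pins copied from the Requirements list above. PyTorch is omitted
# because it is given as a lower bound (>= 2.8.0), not an exact pin.
PINNED = {
    "accelerate": "1.13.0",
    "xformers": "0.0.32.post1",
    "flash-attn": "2.8.2",
    "vllm": "0.11.0",
    "triton": "3.4.0",
    "transformers": "4.57.3",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

The comparison is a plain string match, so a build with a local version suffix (for example a CUDA tag appended to the version) would be reported as a mismatch even if it is otherwise compatible.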
|
|
| ## Citation |
| If you use this code, model, or data in your research or project, please cite: |
|
|
| @inproceedings{wu2026spatialscore, |
| author = {Wu, Haoning and Huang, Xiao and Chen, Yaohui and Zhang, Ya and Wang, Yanfeng and Xie, Weidi}, |
| title = {SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence}, |
| booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, |
| year = {2026}, |
| } |
| |
| ## TODO |
| - [x] Release Paper |
| - [x] Update the final version of the paper |
| - [x] Release version_0 SpatialScore Benchmark |
| - [x] Release version_0 Code of Evaluation |
| - [x] Release version_0 Base Code of SpatialAgent |
| - [x] Release our training resources SpatialCorpus and the SFT models |
| - [x] Update SpatialScore Benchmark |
| - [x] Update Code of Evaluation |
| - [x] Update Code of SpatialAgent |
| |
| ## Acknowledgements |
| Many thanks to the codebases of [transformers](https://github.com/huggingface/transformers), [Qwen3-VL](https://github.com/qwenlm/qwen3-vl), and [TACO](https://github.com/SalesforceAIResearch/TACO). |
| |
| |
| ## Contact |
| If you have any questions, please feel free to contact haoningwu3639@gmail.com. |
| |