haoningwu/SpatialScore

haoningwu committed on May 28

Commit 385aa4d · verified · 1 parent: 7b9ef39

Update README.md

Files changed (1)

README.md +1 -92

@@ -45,99 +45,8 @@ conda env create -f environment.yaml
 conda activate SpatialScore
 ```
-## Dataset
-Please check out [SpaitalScore](https://huggingface.co/datasets/haoningwu/SpatialScore) to download our proposed benchmark (`SpatialScore`).
-If you cannot access Huggingface, you can use [hf-mirror](https://hf-mirror.com/) to download models.
-```
-export HF_ENDPOINT=https://hf-mirror.com # Add this before huggingface-cli download
-```
-You can follow the commands below to prepare the data:
-```
-huggingface-cli download --resume-download --repo-type dataset haoningwu/SpatialScore --local-dir ./ --local-dir-use-symlinks False
-unzip SpatialScore_benchmark.zip
-```
-## Evaluation
-Considering the current mainstream model architectures, we have prioritized support for the Qwen2.5-VL and Qwen3-VL series models.
-You can evaluate them on SpatialScore using the following commands:
-```
-CUDA_VISIBLE_DEVICES=0,1 python test_qwen.py --model_name qwen3vl-4b --model_path ./huggingface/Qwen3-VL-4B-Instruct --dataset_json_path ./SpatialScore_benchmark/SpatialScore_benchmark.ndjson --output_dir ./eval_results
-```
-Now, the All-in-one script supporting all other models is also available.
-You can evaluate other models on SpatialScore using the following commands:
-```
-CUDA_VISIBLE_DEVICES=0,1 python test_all_in_one.py --model_name llava-ov-7b --model_path ../huggingface/LLaVA-OneVision-7B --dataset_json_path ./SpatialScore_benchmark/SpatialScore_benchmark.ndjson --output_dir ./eval_results
-```
-Our final evaluation encompassed rule-based evaluation and LLM-based answer extraction, which are combined to calculate the final accuracy.
-Therefore, you need to configure [GPT-OSS](https://github.com/openai/gpt-oss) and download the corresponding [GPT-OSS-20B](https://huggingface.co/openai/gpt-oss-20b) checkpoint before running the following script to compute the final score:
-```
-MKL_THREADING_LAYER=GNU CUDA_VISIBLE_DEVICES=0 python ./evaluate_results.py --input ./eval_results/qwen3vl-4b
-```
-## Inference with SpatialAgent
-Before using SpatialAgent, you need to install the additional dependencies required by the toolbox according to the Requirements section.
-In addition, you should download the checkpoints for the spatial perception tools being used and place them in the `./SpatialAgent/checkpoints/` directory, which should have a structure similar to the following:
-```
-./SpatialAgent/checkpoints
-├── dinov2-large
-├── Orient-Anything
-│   ├── base100p
-│   ├── base100p2
-│   ├── base25p
-│   ├── base50p
-│   ├── base75p
-│   ├── base75p2
-│   ├── celarge
-│   ├── cropbaseEx03
-│   ├── croplargeEX03
-│   ├── croplargeEX2
-│   ├── cropsmallEx03
-│   ├── mixreallarge
-│   └── ronormsigma1
-└── RAFT
-./SpatialAgent/DepthAnythingV2
-└── ckpt
-│   ├── hypersim.pth
-│   └── vkitti.pth
-./SpatialAgent/DetAny3D
-├── GroundingDINO
-│   └── weights
-│       └── groundingdino_swinb_cogcoor.pth
-├── checkpoints/detany3d
-│   ├── detany3d_ckpts
-│   ├── dino_ckpts
-│   ├── sam_ckpts
-│   └── unidepth_ckpts
-└── models--bert-base-uncased
-```
-Furthermore, for [DetAny3D](https://github.com/OpenDriveLab/DetAny3D) and [DepthAnythingV2](https://github.com/DepthAnything/Depth-Anything-V2), you will also need to refer to their respective repositories, download the required checkpoints, and place them in their corresponding directories.
-Our SpatialAgent supports two reasoning paradigms: Plan-Execute and ReAct. You can perform inference using the following script:
-```
-# Plan-Execute paradigm
-CUDA_VISIBLE_DEVICES=0  python inference_plan-execute.py --start 0 --end 1000 --prompt_format cota --model_path ../huggingface/Qwen3-VL-4B-Instruct --model_name qwen3vl-4b
-# ReAct paradigm
-CUDA_VISIBLE_DEVICES=0  python inference_ReAct.py --start 0 --end 1000 --execute --prompt_format cota --model_path ../huggingface/Qwen3-VL-4B-Instruct --model_name qwen3vl-4b
-```
 ## Citation
-If you use this code and data for your research or project, please cite:
+If you use this code, model, and data for your research or project, please cite:
 	@inproceedings{wu2026spatialscore,
       author    = {Wu, Haoning and Huang, Xiao and Chen, Yaohui and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
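
Note on the removed Evaluation commands: the path passed to `--dataset_json_path` ends in `.ndjson`, which conventionally means one JSON object per line. The diff does not show the file's contents, so the sketch below only assumes that convention and makes no assumption about field names. `load_ndjson` is a hypothetical helper written for illustration; it is not part of the repository.

```python
import json
from pathlib import Path


def load_ndjson(path):
    """Return a list with one parsed JSON value per non-empty line of `path`.

    Assumes the newline-delimited JSON convention; the actual layout of the
    SpatialScore benchmark file is not shown in this diff.
    """
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as err:
                # Report the offending line so a malformed record is easy to find.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({err.msg})"
                ) from err
    return records
```

If the benchmark file follows that convention, `len(load_ndjson("./SpatialScore_benchmark/SpatialScore_benchmark.ndjson"))` would give the number of records before an evaluation run is started.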