CooperScene: Multi-Modal Cooperative Autonomy Benchmark with C-V2X Communication Characterization

Introduction

🚗 This repository hosts the model configs and pre-trained checkpoints for CooperScene — the first real-world, multi-agent, multi-modal cooperative autonomy dataset with C-V2X communication characterization (three connected vehicles + one roadside unit, across intersections, highway ramps, and parking areas).

🚀 All training and inference code is open-sourced. See the project page and the GitHub repo for details.

💬 We welcome feedback and look forward to your comments!

What's here

Each model has its config and matching checkpoint together under configs/<model>/:

Cooperative detectors	BEVFusion
`cobevt`	`bevfusion_single_lidar`
`cosdh`	`bevfusion_single_lidarcam`
`ermvp`	`bevfusion_coop_lidar`
`v2vam`	`bevfusion_coop_lidarcam`
`v2vnet`
`v2xvit`

All models run on a unified mmengine pipeline (proj_first=True) and are scored with the same metric: global-sort BEV/3D polygon-IoU AP at IoU thresholds 0.3 / 0.5 / 0.7. The numbers are therefore directly comparable across models.
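For readers unfamiliar with the metric, here is a minimal sketch of a globally sorted AP for a single IoU threshold. It assumes each detection has already been matched against ground truth (for example by polygon IoU at 0.3, 0.5, or 0.7) and flagged as a true or false positive; that matching step is not shown. The function name `average_precision` and the all-point interpolation are illustrative assumptions, not necessarily what the CooperScene code uses.

```python
from typing import List, Tuple


def average_precision(detections: List[Tuple[float, bool]], num_gt: int) -> float:
    """Area under the precision-recall curve for one IoU threshold.

    detections: (confidence, is_true_positive) pairs pooled over ALL frames.
    num_gt: total number of ground-truth boxes over the same frames.
    """
    if num_gt == 0:
        return 0.0

    # Global sort: rank every detection in the split by confidence,
    # instead of computing a separate curve per frame.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)

    # Make precision non-increasing from left to right (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas between consecutive recall levels.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Toy example: three detections pooled from several frames, two GT boxes.
    print(average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2))
```

Calling the function once per threshold, with true-positive flags recomputed at that threshold, would give the three AP values reported side by side.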

Download

pip install -U huggingface_hub
hf download cisl-hf/CooperScene --local-dir assets
# -> assets/configs/<model>/{<model>.py, <model>.pth}
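The same download can also be scripted from Python. The sketch below uses `snapshot_download` from `huggingface_hub` and then resolves the config and checkpoint paths from the layout shown above. The helper names `download_assets`, `model_files`, and `missing_files` are hypothetical and are not part of the CooperScene code.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def download_assets(local_dir: str = "assets") -> str:
    """Python counterpart of `hf download cisl-hf/CooperScene --local-dir assets`."""
    # Imported lazily so the path helpers below work without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id="cisl-hf/CooperScene", local_dir=local_dir)


def model_files(model: str, assets_dir: str = "assets") -> Tuple[Path, Path]:
    """Return (config, checkpoint) following configs/<model>/<model>.{py,pth}."""
    model_dir = Path(assets_dir) / "configs" / model
    return model_dir / f"{model}.py", model_dir / f"{model}.pth"


def missing_files(models: Iterable[str], assets_dir: str = "assets") -> List[Path]:
    """List expected files that are not on disk (a simple sanity check)."""
    return [p for m in models for p in model_files(m, assets_dir) if not p.exists()]


if __name__ == "__main__":
    download_assets("assets")
    print(missing_files(["ermvp", "bevfusion_coop_lidar"]))
```

An empty list from `missing_files` suggests the requested models downloaded completely; any path it returns points at a config or checkpoint that still needs to be fetched.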

Usage

Clone the code repo, then evaluate or train with a downloaded config + checkpoint:

# evaluate (test split by default)
python tools/test.py assets/configs/ermvp/ermvp.py assets/configs/ermvp/ermvp.pth

# train from the downloaded config (warm-starting from a checkpoint is optional)
python tools/train.py assets/configs/ermvp/ermvp.py

See the GitHub README for data preparation and the Docker workflow.
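To evaluate several models in one go, the evaluate command above can be generated per model. This is a minimal sketch that assumes it is run from the root of the cloned code repo with the assets downloaded to `assets/`; `build_test_command` and `COOPERATIVE_MODELS` are hypothetical names introduced here for illustration.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Cooperative detector names as listed under "What's here".
COOPERATIVE_MODELS = ["cobevt", "cosdh", "ermvp", "v2vam", "v2vnet", "v2xvit"]


def build_test_command(model: str, assets_dir: str = "assets") -> List[str]:
    """Build the argv for `python tools/test.py <config> <checkpoint>`."""
    model_dir = Path(assets_dir) / "configs" / model
    config = (model_dir / f"{model}.py").as_posix()
    checkpoint = (model_dir / f"{model}.pth").as_posix()
    # sys.executable keeps the run inside the currently active environment.
    return [sys.executable, "tools/test.py", config, checkpoint]


if __name__ == "__main__":
    for name in COOPERATIVE_MODELS:
        cmd = build_test_command(name)
        print("running:", " ".join(cmd))
        # check=True stops the loop at the first failing evaluation.
        subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues if the assets directory contains spaces.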
