# SOTA COCO Object Detection with PE
## Getting started
Please refer to [INSTALL.md](../INSTALL.md) for installation and dataset preparation instructions.
Also install [Deformable Attention](models/ops/make.sh) ops.
## Results and Fine-tuned Models
| detector |
vision encoder |
box AP |
box(TTA) AP |
download |
| DETA |
PE spatial G |
65.2 |
66.0 |
model |
## Training
We apply a four-stage training, Objects365(12ep, 1024pix), Objects365(6ep, 1536pix), COCO(12ep, 1728pix), COCO(3ep, 1824pix)
```
sbatch scripts/pretrain_spatial_Gwin384_o365ep12_1024pix_16node.sh
sbatch scripts/pretrain_continue_spatial_Gwin384_o365ep6_1536pix_16node.sh
sbatch scripts/finetune_spatial_Gwin384_cocoep12_1728pix_8node.sh
sbatch scripts/finetune_further_spatial_Gwin384_cocoep3_1824pix_8node.sh
```
## Evaluation
```
bash scripts/eval_1824pix.sh --resume deta_coco_1824pix.pth
```
## Evaluation with TTA (Test-Time Augmentation)
```
sbatch scripts/eval_tta_slurm_1824pix.sh --resume deta_coco_1824pix.pth
```
Note: If you get 65.9 AP, it is probably caused by different package versions, trying different hyperparameters like `--quad_scale 0.4` will give 66.0 AP.