---
language:
- en
license: apache-2.0
tags:
- computer-vision
- image-matching
- overlap-detection
- feature-extraction
datasets:
- SSSSphinx/SCoDe
---

# SCoDe: Scale-aware Co-visible Region Detection for Image Matching
[![Paper](https://img.shields.io/badge/Paper-ScienceDirect-green)](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260) [![DOI](https://img.shields.io/badge/DOI-10.1016%2Fj.isprsjprs.2025.08.015-orange)](https://doi.org/10.1016/j.isprsjprs.2025.08.015) [![Project Page](https://img.shields.io/badge/Project-Website-blue)](https://xupan.top/Projects/scode) [![GitHub](https://img.shields.io/badge/Code-GitHub-black)](https://github.com/SSSSphinx/SCoDe)
## Overview

SCoDe is a scale-aware co-visible region detection model for robust image matching. It detects the overlapping (co-visible) regions between an image pair while remaining invariant to scale variations, making it particularly effective for structure-from-motion and 3D reconstruction tasks.

The model is built on the CCOE (Co-visible region detection with Overlap Estimation) architecture and trained on the MegaDepth dataset.

## Model Details

- **Architecture**: CCOE-based transformer with multi-scale attention
- **Backbone**: ResNet-50
- **Input Size**: 1024×1024 (configurable)
- **Training Dataset**: MegaDepth
- **Framework**: PyTorch

### Key Features

- Scale-aware overlap region detection
- Rotation-invariant matching capabilities
- End-to-end trainable pipeline
- Compatible with various feature extractors (SIFT, SuperPoint, D2-Net, R2D2, DISK)

## Usage

### Installation

```bash
pip install torch torchvision
git clone https://github.com/SSSSphinx/SCoDe.git
cd SCoDe
pip install -r requirements.txt
```

### Quick Start

```python
import torch

from src.config.default import get_cfg_defaults
from src.model import CCOE

# Load configuration
cfg = get_cfg_defaults()
cfg.merge_from_file('configs/scode_config.py')

# Initialize model
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = CCOE(cfg.CCOE).eval().to(device)

# Load pre-trained weights
model.load_state_dict(torch.load('weights/scode.pth', map_location=device))

# Run inference on an image pair (random tensors shown as placeholders)
with torch.no_grad():
    image1 = torch.randn(1, 3, 1024, 1024).to(device)
    image2 = torch.randn(1, 3, 1024, 1024).to(device)
    output = model({'image1': image1, 'image2': image2})
```

### Training

```bash
# Single-GPU training
python train_scode.py --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5

# Multi-GPU distributed training (4 GPUs)
python -m torch.distributed.launch --nproc_per_node 4 --master_port=29501 train_scode.py \
    --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
```

### Evaluation

#### Rotation Invariance Evaluation

```bash
python rot_inv_eval.py \
    --extractors superpoint d2net r2d2 disk \
    --image_pairs path/to/image/pairs \
    --output_dir outputs/scode_rot_eval
```

#### Pose Estimation Evaluation

```bash
python eval_pose_estimation.py \
    --results_dir outputs/megadepth_results \
    --dataset megadepth
```

#### Radar Evaluation

```bash
python eval_radar.py \
    --results_dir outputs/radar_results
```

## Configuration

Main configuration files:

- [`configs/scode_config.py`](configs/scode_config.py) - SCoDe model configuration
- [`src/config/default.py`](src/config/default.py) - Default configuration template

### Key Parameters

```python
# Training
cfg.DATASET.TRAIN.IMAGE_SIZE = [1024, 1024]
cfg.DATASET.TRAIN.BATCH_SIZE = 4
cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000

# Validation
cfg.DATASET.VAL.IMAGE_SIZE = [1024, 1024]

# Model
cfg.CCOE.BACKBONE.NUM_LAYERS = 50
cfg.CCOE.BACKBONE.STRIDE = 32
cfg.CCOE.CCA.DEPTH = [2, 2, 2, 2]
cfg.CCOE.CCA.NUM_HEADS = [8, 8, 8, 8]
```

## Dataset

The model is trained on the [MegaDepth](https://github.com/zhengqili/MegaDepth) dataset with scale-aware pair generation.

Dataset preparation:

```bash
python dataset_preparation.py \
    --base_path dataset/megadepth/MegaDepth \
    --num_per_scene 5000
```

Validation pairs are automatically generated and evaluated during training.
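The exact output format of the model is not documented in this card; assuming it predicts a normalized co-visible box `[x0, y0, x1, y1]` per image (a common convention for overlap detectors), a typical downstream step is to crop each image to its detected region before running a feature extractor. A minimal sketch, where the helper name `box_to_pixels` is illustrative and not part of the repository:

```python
def box_to_pixels(box, width, height):
    """Convert a normalized [x0, y0, x1, y1] box to integer pixel coordinates.

    Assumes coordinates lie in [0, 1]; values are rounded and clamped so the
    resulting crop stays inside the image bounds.
    """
    x0, y0, x1, y1 = box
    px0 = max(0, int(round(x0 * width)))
    py0 = max(0, int(round(y0 * height)))
    px1 = min(width, int(round(x1 * width)))
    py1 = min(height, int(round(y1 * height)))
    return px0, py0, px1, py1


# Example: an overlap box covering the right half of a 1024x1024 image
print(box_to_pixels([0.5, 0.0, 1.0, 1.0], 1024, 1024))  # (512, 0, 1024, 1024)
```

The returned pixel box can then be used to slice the image tensor (e.g. `image[:, :, py0:py1, px0:px1]`) before keypoint extraction.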
## Model Performance

SCoDe demonstrates strong performance on:

- **Rotation Invariance**: robust to image rotations up to 360°
- **Scale Invariance**: effective across multiple image scales
- **Pose Estimation**: improved camera pose estimation on the MegaDepth benchmark
- **Feature Matching**: enhanced matching accuracy with various feature extractors

## Supported Feature Extractors

The model works seamlessly with:

- SIFT (with brute-force matcher)
- SuperPoint (with NN matcher)
- D2-Net
- R2D2
- DISK

## Citation

If you find this project useful in your research, please cite our paper:

```bibtex
@article{pan2025scale,
  title={Scale-aware co-visible region detection for image matching},
  author={Pan, Xu and Xia, Zimin and Zheng, Xianwei},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={229},
  pages={122--137},
  year={2025},
  publisher={Elsevier}
}
```

## License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

## Acknowledgments

- [MegaDepth](https://github.com/zhengqili/MegaDepth) - dataset and benchmarks
- [OETR](https://github.com/TencentYoutuResearch/ImageMatching-OETR) - model initialization strategies
- The PyTorch team for the excellent framework

## Contact

For questions or issues, please visit the [GitHub repository](https://github.com/SSSSphinx/SCoDe) or contact the authors.

---

**Paper**: [Scale-aware Co-visible Region Detection for Image Matching](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)

**Project Page**: [https://xupan.top/Projects/scode](https://xupan.top/Projects/scode)
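As a footnote on evaluating overlap detectors like the one described above: a standard metric for comparing a predicted co-visible box against a ground-truth box is intersection-over-union (IoU). The repository's evaluation scripts are not reproduced here, but a minimal, dependency-free sketch of the metric (the `box_iou` helper is illustrative, not part of the codebase) looks like this:

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes [x0, y0, x1, y1]."""
    # Intersection rectangle (empty if the boxes do not overlap)
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)

    # Union = sum of areas minus the intersection
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1)
print(box_iou([0, 0, 2, 2], [1, 1, 3, 3]))  # 0.14285714285714285
```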