---
language:
- en
license: apache-2.0
tags:
- computer-vision
- image-matching
- overlap-detection
- feature-extraction
datasets:
- SSSSphinx/SCoDe
---

# SCoDe: Scale-aware Co-visible Region Detection for Image Matching

<div align="center">

[Paper](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)
[DOI](https://doi.org/10.1016/j.isprsjprs.2025.08.015)
[Project Page](https://xupan.top/Projects/scode)
[Code](https://github.com/SSSSphinx/SCoDe)

</div>

## Overview

SCoDe is a scale-aware co-visible region detection model designed for robust image matching. It detects the overlapping regions between an image pair while remaining invariant to scale variations, making it particularly effective for structure-from-motion and 3D reconstruction tasks.
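
Co-visible region detection is commonly scored by the intersection-over-union (IoU) between a predicted overlap region and the ground truth. As a minimal, dependency-free sketch (illustrative only, not SCoDe's actual evaluation code), IoU for axis-aligned boxes can be computed as:

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: predicted vs. ground-truth co-visible boxes
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```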

This model is built upon the CCOE (Co-visible region detection with Overlap Estimation) architecture and has been trained on the MegaDepth dataset.

## Model Details

- **Architecture**: CCOE-based transformer with multi-scale attention
- **Backbone**: ResNet-50
- **Input Size**: 1024×1024 (configurable)
- **Training Dataset**: MegaDepth
- **Framework**: PyTorch

### Key Features

- Scale-aware overlap region detection
- Rotation-invariant matching capabilities
- End-to-end trainable pipeline
- Compatible with various feature extractors (SIFT, SuperPoint, D2-Net, R2D2, DISK)

## Usage

### Installation

```bash
pip install torch torchvision
git clone https://github.com/SSSSphinx/SCoDe.git
cd SCoDe
pip install -r requirements.txt
```

### Quick Start

```python
import torch
from src.config.default import get_cfg_defaults
from src.model import CCOE

# Load configuration
cfg = get_cfg_defaults()
cfg.merge_from_file('configs/scode_config.py')

# Initialize model
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = CCOE(cfg.CCOE).eval().to(device)

# Load pre-trained weights
model.load_state_dict(torch.load('weights/scode.pth', map_location=device))

# Model is ready for inference
with torch.no_grad():
    # Process an image pair (dummy tensors shown; replace with real images)
    image1 = torch.randn(1, 3, 1024, 1024).to(device)
    image2 = torch.randn(1, 3, 1024, 1024).to(device)
    output = model({'image1': image1, 'image2': image2})
```
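
The dummy `torch.randn` tensors above stand in for real images, which must be brought to the 1024×1024 input size. A common scheme (an assumption here — check the repository's data loaders for SCoDe's exact preprocessing) scales the longer side to 1024 while preserving aspect ratio, then pads to a square:

```python
def resize_dims(h, w, target=1024):
    """New (h, w) with the longer side scaled to `target`, aspect ratio kept."""
    scale = target / max(h, w)
    return round(h * scale), round(w * scale)

def pad_to_square(h, w, target=1024):
    """Bottom/right padding needed to reach a target x target canvas."""
    return target - h, target - w

nh, nw = resize_dims(2048, 1536)       # -> (1024, 768)
pad_h, pad_w = pad_to_square(nh, nw)   # -> (0, 256)
```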

### Training

```bash
# Single-GPU training
python train_scode.py --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5

# Multi-GPU distributed training (4 GPUs)
python -m torch.distributed.launch --nproc_per_node 4 --master_port=29501 train_scode.py \
    --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
```

### Evaluation

#### Rotation Invariance Evaluation

```bash
python rot_inv_eval.py \
    --extractors superpoint d2net r2d2 disk \
    --image_pairs path/to/image/pairs \
    --output_dir outputs/scode_rot_eval
```
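
Rotation-invariance evaluation typically rotates one image of a pair and checks whether detections still correspond. To compare keypoints across a rotation, coordinates can be mapped with a 2D rotation about the image center; the helper below is an illustrative sketch, not part of `rot_inv_eval.py`:

```python
import math

def rotate_point(x, y, angle_deg, cx, cy):
    """Rotate (x, y) by angle_deg around (cx, cy).

    Counter-clockwise in a y-up frame; on image coordinates (y pointing
    down) the same formula corresponds to a clockwise rotation.
    """
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))

# A keypoint at (100, 50) in a 1024x1024 image, rotated by 90 degrees
# about the center, lands at approximately (974, 100).
print(rotate_point(100, 50, 90, 512, 512))
```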

#### Pose Estimation Evaluation

```bash
python eval_pose_estimation.py \
    --results_dir outputs/megadepth_results \
    --dataset megadepth
```

#### Radar Evaluation

```bash
python eval_radar.py \
    --results_dir outputs/radar_results
```

## Configuration

Main configuration files:

- [`configs/scode_config.py`](configs/scode_config.py) - SCoDe model configuration
- [`src/config/default.py`](src/config/default.py) - Default configuration template

### Key Parameters

```python
# Training
cfg.DATASET.TRAIN.IMAGE_SIZE = [1024, 1024]
cfg.DATASET.TRAIN.BATCH_SIZE = 4
cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000

# Validation
cfg.DATASET.VAL.IMAGE_SIZE = [1024, 1024]

# Model
cfg.CCOE.BACKBONE.NUM_LAYERS = 50
cfg.CCOE.BACKBONE.STRIDE = 32
cfg.CCOE.CCA.DEPTH = [2, 2, 2, 2]
cfg.CCOE.CCA.NUM_HEADS = [8, 8, 8, 8]
```
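
These numbers fix the epoch length: with `PAIRS_LENGTH = 128000` and a per-GPU `BATCH_SIZE = 4`, one epoch is 32,000 iterations on a single GPU, and data-parallel training over 4 GPUs divides that accordingly (assuming, as is typical, that the pair list is split evenly across workers):

```python
pairs_length = 128_000  # cfg.DATASET.TRAIN.PAIRS_LENGTH
batch_size = 4          # cfg.DATASET.TRAIN.BATCH_SIZE (per GPU)
num_gpus = 4

steps_single = pairs_length // batch_size                    # 32000 iterations/epoch
steps_distributed = pairs_length // (batch_size * num_gpus)  # 8000 iterations/epoch
```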

## Dataset

The model is trained on the [MegaDepth](https://github.com/zhengqili/MegaDepth) dataset with scale-aware pair generation.

Dataset preparation:

```bash
python dataset_preparation.py \
    --base_path dataset/megadepth/MegaDepth \
    --num_per_scene 5000
```

Validation pairs are automatically generated and evaluated during training.

## Model Performance

SCoDe demonstrates strong performance on:

- **Rotation Invariance**: Robust to image rotations across the full 360° range
- **Scale Invariance**: Effective across multiple image scales
- **Pose Estimation**: Improved camera pose estimation on the MegaDepth benchmark
- **Feature Matching**: Enhanced matching accuracy with various feature extractors

## Supported Feature Extractors

The model works seamlessly with:

- SIFT (with brute-force matcher)
- SuperPoint (with NN matcher)
- D2-Net
- R2D2
- DISK
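
All of these extractors produce descriptors that are matched by nearest neighbour; SIFT in particular is traditionally paired with a brute-force matcher plus Lowe's ratio test. Below is a minimal pure-Python sketch of that matching step (real pipelines would use OpenCV or a GPU matcher):

```python
def match_ratio_test(desc1, desc2, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test.

    desc1, desc2: lists of descriptor vectors (lists of floats); desc2 must
    have at least two entries. Returns (i, j) index pairs that pass the test.
    """
    def dist2(a, b):
        # Squared Euclidean distance between two descriptors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches = []
    for i, d1 in enumerate(desc1):
        order = sorted(range(len(desc2)), key=lambda j: dist2(d1, desc2[j]))
        best, second = order[0], order[1]
        # Accept only if the best match is clearly closer than the runner-up
        if dist2(d1, desc2[best]) < ratio ** 2 * dist2(d1, desc2[second]):
            matches.append((i, best))
    return matches

# An unambiguous match is kept; an ambiguous one is rejected.
print(match_ratio_test([[0.0, 0.0]], [[0.1, 0.0], [5.0, 5.0]]))  # [(0, 0)]
print(match_ratio_test([[0.0, 0.0]], [[1.0, 0.0], [1.05, 0.0]]))  # []
```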

## Citation

If you find this project useful in your research, please cite our paper:

```bibtex
@article{pan2025scale,
  title={Scale-aware co-visible region detection for image matching},
  author={Pan, Xu and Xia, Zimin and Zheng, Xianwei},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={229},
  pages={122--137},
  year={2025},
  publisher={Elsevier}
}
```

## License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

## Acknowledgments

- [MegaDepth](https://github.com/zhengqili/MegaDepth) - Dataset and benchmarks
- [OETR](https://github.com/TencentYoutuResearch/ImageMatching-OETR) - Model initialization strategies
- PyTorch team for the excellent framework

## Contact

For questions or issues, please visit the [GitHub repository](https://github.com/SSSSphinx/SCoDe) or contact the authors.

---

**Paper**: [Scale-aware Co-visible Region Detection for Image Matching](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)  
**Project Page**: [https://xupan.top/Projects/scode](https://xupan.top/Projects/scode)