---
language:
- en
license: apache-2.0
tags:
- computer-vision
- image-matching
- overlap-detection
- feature-extraction
datasets:
- SSSSphinx/SCoDe
---

# SCoDe: Scale-aware Co-visible Region Detection for Image Matching

<div align="center">

[Paper](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)
[DOI](https://doi.org/10.1016/j.isprsjprs.2025.08.015)
[Project Page](https://xupan.top/Projects/scode)
[Code](https://github.com/SSSSphinx/SCoDe)

</div>

## Overview

SCoDe is a scale-aware co-visible region detection model designed for robust image matching. It detects the overlapping regions between an image pair while remaining invariant to scale variations, making it particularly effective for structure-from-motion and 3D reconstruction tasks.
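
Co-visible region detection is commonly scored by the intersection-over-union (IoU) between a predicted overlap region and the ground truth. As a minimal, dependency-free sketch (illustrative only, not SCoDe's actual evaluation code), IoU for axis-aligned boxes can be computed as:

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: predicted vs. ground-truth co-visible boxes
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```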

This model is built upon the CCOE (Co-visible region detection with Overlap Estimation) architecture and has been trained on the MegaDepth dataset.

## Model Details

- **Architecture**: CCOE-based transformer with multi-scale attention
- **Backbone**: ResNet-50
- **Input Size**: 1024×1024 (configurable)
- **Training Dataset**: MegaDepth
- **Framework**: PyTorch

### Key Features

- Scale-aware overlap region detection
- Rotation-invariant matching capabilities
- End-to-end trainable pipeline
- Compatible with various feature extractors (SIFT, SuperPoint, D2-Net, R2D2, DISK)

## Usage

### Installation

```bash
pip install torch torchvision
git clone https://github.com/SSSSphinx/SCoDe.git
cd SCoDe
pip install -r requirements.txt
```

### Quick Start

```python
import torch
from src.config.default import get_cfg_defaults
from src.model import CCOE

# Load configuration
cfg = get_cfg_defaults()
cfg.merge_from_file('configs/scode_config.py')

# Initialize model
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = CCOE(cfg.CCOE).eval().to(device)

# Load pre-trained weights
model.load_state_dict(torch.load('weights/scode.pth', map_location=device))

# Model is ready for inference
with torch.no_grad():
    # Process an image pair (dummy tensors shown; replace with real images)
    image1 = torch.randn(1, 3, 1024, 1024).to(device)
    image2 = torch.randn(1, 3, 1024, 1024).to(device)
    output = model({'image1': image1, 'image2': image2})
```
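
The dummy `torch.randn` tensors above stand in for real images, which must be brought to the 1024×1024 input size. A common scheme (an assumption here — check the repository's data loaders for SCoDe's exact preprocessing) scales the longer side to 1024 while preserving aspect ratio, then pads to a square:

```python
def resize_dims(h, w, target=1024):
    """New (h, w) with the longer side scaled to `target`, aspect ratio kept."""
    scale = target / max(h, w)
    return round(h * scale), round(w * scale)

def pad_to_square(h, w, target=1024):
    """Bottom/right padding needed to reach a target x target canvas."""
    return target - h, target - w

nh, nw = resize_dims(2048, 1536)       # -> (1024, 768)
pad_h, pad_w = pad_to_square(nh, nw)   # -> (0, 256)
```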

### Training

```bash
# Single-GPU training
python train_scode.py --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5

# Multi-GPU distributed training (4 GPUs)
python -m torch.distributed.launch --nproc_per_node 4 --master_port=29501 train_scode.py \
    --num_workers 4 --epoch 15 --batch_size 4 --validation --learning_rate 1e-5
```

### Evaluation

#### Rotation Invariance Evaluation

```bash
python rot_inv_eval.py \
    --extractors superpoint d2net r2d2 disk \
    --image_pairs path/to/image/pairs \
    --output_dir outputs/scode_rot_eval
```
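
Rotation-invariance evaluation typically rotates one image of a pair and checks whether detections still correspond. To compare keypoints across a rotation, coordinates can be mapped with a 2D rotation about the image center; the helper below is an illustrative sketch, not part of `rot_inv_eval.py`:

```python
import math

def rotate_point(x, y, angle_deg, cx, cy):
    """Rotate (x, y) by angle_deg around (cx, cy).

    Counter-clockwise in a y-up frame; on image coordinates (y pointing
    down) the same formula corresponds to a clockwise rotation.
    """
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))

# A keypoint at (100, 50) in a 1024x1024 image, rotated by 90 degrees
# about the center, lands at approximately (974, 100).
print(rotate_point(100, 50, 90, 512, 512))
```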

#### Pose Estimation Evaluation

```bash
python eval_pose_estimation.py \
    --results_dir outputs/megadepth_results \
    --dataset megadepth
```

#### Radar Evaluation

```bash
python eval_radar.py \
    --results_dir outputs/radar_results
```

## Configuration

Main configuration files:

- [`configs/scode_config.py`](configs/scode_config.py) - SCoDe model configuration
- [`src/config/default.py`](src/config/default.py) - Default configuration template

### Key Parameters

```python
# Training
cfg.DATASET.TRAIN.IMAGE_SIZE = [1024, 1024]
cfg.DATASET.TRAIN.BATCH_SIZE = 4
cfg.DATASET.TRAIN.PAIRS_LENGTH = 128000

# Validation
cfg.DATASET.VAL.IMAGE_SIZE = [1024, 1024]

# Model
cfg.CCOE.BACKBONE.NUM_LAYERS = 50
cfg.CCOE.BACKBONE.STRIDE = 32
cfg.CCOE.CCA.DEPTH = [2, 2, 2, 2]
cfg.CCOE.CCA.NUM_HEADS = [8, 8, 8, 8]
```
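
These numbers fix the epoch length: with `PAIRS_LENGTH = 128000` and a per-GPU `BATCH_SIZE = 4`, one epoch is 32,000 iterations on a single GPU, and data-parallel training over 4 GPUs divides that accordingly (assuming, as is typical, that the pair list is split evenly across workers):

```python
pairs_length = 128_000  # cfg.DATASET.TRAIN.PAIRS_LENGTH
batch_size = 4          # cfg.DATASET.TRAIN.BATCH_SIZE (per GPU)
num_gpus = 4

steps_single = pairs_length // batch_size                    # 32000 iterations/epoch
steps_distributed = pairs_length // (batch_size * num_gpus)  # 8000 iterations/epoch
```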

## Dataset

The model is trained on the [MegaDepth](https://github.com/zhengqili/MegaDepth) dataset with scale-aware pair generation.

Dataset preparation:

```bash
python dataset_preparation.py \
    --base_path dataset/megadepth/MegaDepth \
    --num_per_scene 5000
```

Validation pairs are automatically generated and evaluated during training.

## Model Performance

SCoDe demonstrates strong performance on:

- **Rotation Invariance**: Robust to image rotations across the full 360° range
- **Scale Invariance**: Effective across multiple image scales
- **Pose Estimation**: Improved camera pose estimation on the MegaDepth benchmark
- **Feature Matching**: Enhanced matching accuracy with various feature extractors

## Supported Feature Extractors

The model works seamlessly with:

- SIFT (with brute-force matcher)
- SuperPoint (with NN matcher)
- D2-Net
- R2D2
- DISK
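
All of these extractors produce descriptors that are matched by nearest neighbour; SIFT in particular is traditionally paired with a brute-force matcher plus Lowe's ratio test. Below is a minimal pure-Python sketch of that matching step (real pipelines would use OpenCV or a GPU matcher):

```python
def match_ratio_test(desc1, desc2, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test.

    desc1, desc2: lists of descriptor vectors (lists of floats); desc2 must
    have at least two entries. Returns (i, j) index pairs that pass the test.
    """
    def dist2(a, b):
        # Squared Euclidean distance between two descriptors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches = []
    for i, d1 in enumerate(desc1):
        order = sorted(range(len(desc2)), key=lambda j: dist2(d1, desc2[j]))
        best, second = order[0], order[1]
        # Accept only if the best match is clearly closer than the runner-up
        if dist2(d1, desc2[best]) < ratio ** 2 * dist2(d1, desc2[second]):
            matches.append((i, best))
    return matches

# An unambiguous match is kept; an ambiguous one is rejected.
print(match_ratio_test([[0.0, 0.0]], [[0.1, 0.0], [5.0, 5.0]]))  # [(0, 0)]
print(match_ratio_test([[0.0, 0.0]], [[1.0, 0.0], [1.05, 0.0]]))  # []
```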

## Citation

If you find this project useful in your research, please cite our paper:

```bibtex
@article{pan2025scale,
  title={Scale-aware co-visible region detection for image matching},
  author={Pan, Xu and Xia, Zimin and Zheng, Xianwei},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={229},
  pages={122--137},
  year={2025},
  publisher={Elsevier}
}
```

## License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

## Acknowledgments

- [MegaDepth](https://github.com/zhengqili/MegaDepth) - Dataset and benchmarks
- [OETR](https://github.com/TencentYoutuResearch/ImageMatching-OETR) - Model initialization strategies
- PyTorch team for the excellent framework

## Contact

For questions or issues, please visit the [GitHub repository](https://github.com/SSSSphinx/SCoDe) or contact the authors.

---

**Paper**: [Scale-aware Co-visible Region Detection for Image Matching](https://www.sciencedirect.com/science/article/abs/pii/S0924271625003260)  
**Project Page**: [https://xupan.top/Projects/scode](https://xupan.top/Projects/scode)