---
license: creativeml-openrail-m
datasets:
- DarthReca/crisislandmark
language:
- en
library_name: transformers
tags:
- remote-sensing
- text-to-image-retrieval
- multimodal
- geospatial
- SAR
- multispectral
- crisis-management
- earth-observation
- contrastive-learning
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
# CLOSP-VL

CLOSP (Contrastive Language Optical SAR Pretraining) is a multimodal architecture designed for text-to-image retrieval.
It creates a unified embedding space for text, Sentinel-2 (MSI), and Sentinel-1 (SAR) data.
The CLOSP-VL variant uses a ViT-large vision backbone.

## Model Details
The model uses three separate encoders: one for text, one for Sentinel-1 (SAR) data, and one for Sentinel-2 (MSI) data.
During training, it uses a contrastive objective to align the textual embeddings with the corresponding visual embeddings (either SAR or MSI).

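The contrastive alignment described above can be illustrated with a minimal NumPy sketch. It assumes a CLIP-style symmetric InfoNCE loss with a fixed temperature; this card does not state the exact loss, so the function `symmetric_contrastive_loss`, its `temperature` default, and the toy inputs are illustrative assumptions rather than the authors' implementation. Here `image_emb` stands for the output of whichever visual encoder (SAR or MSI) matches each sample.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def log_softmax(logits: np.ndarray, axis: int) -> np.ndarray:
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Assumed CLIP-style symmetric InfoNCE loss (not taken from the paper).

    Row i of text_emb is the caption paired with row i of image_emb.
    """
    t = l2_normalize(np.asarray(text_emb, dtype=float))
    v = l2_normalize(np.asarray(image_emb, dtype=float))
    logits = t @ v.T / temperature  # (batch, batch) similarity matrix
    diag = np.arange(len(logits))
    # Text-to-image: each caption should pick its own image among the batch.
    loss_t2i = -log_softmax(logits, axis=1)[diag, diag].mean()
    # Image-to-text: each image should pick its own caption among the batch.
    loss_i2t = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (loss_t2i + loss_i2t)


# Toy embeddings: orthogonal unit vectors stand in for encoder outputs.
paired = np.eye(4)
matched = symmetric_contrastive_loss(paired, paired)
mismatched = symmetric_contrastive_loss(paired, np.roll(paired, 1, axis=0))
print(f"matched={matched:.4f} mismatched={mismatched:.4f}")
```

Correctly paired embeddings give a loss near zero, while shifting the image rows by one position (so every caption is paired with the wrong image) gives a much larger loss, which is the signal that pulls matching text and image embeddings together during training.
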
- **Developed by:** Daniele Rege Cambrin
- **Model type:** CLOSP
- **Language(s) (NLP):** English
- **License:** CreativeML OpenRAIL-M
- **Finetuned from model:** [More Information Needed]
- **Repository:** [GitHub](https://github.com/DarthReca/closp)
- **Paper:** [ArXiv](https://arxiv.org/abs/2507.10403)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModel
model = AutoModel.from_pretrained("DarthReca/CLOSP-VL", trust_remote_code=True)
```

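Because text, Sentinel-1, and Sentinel-2 inputs share one embedding space, retrieval reduces to ranking image embeddings by similarity to a text embedding. The sketch below shows that ranking step with cosine similarity in NumPy. The model's own encoding methods are not documented in this card, so the vectors are placeholders standing in for real model outputs, and `rank_images` is a hypothetical helper, not part of the released code.

```python
import numpy as np


def rank_images(text_emb: np.ndarray, image_embs: np.ndarray, top_k: int = 3) -> np.ndarray:
    """Return indices of the top_k images most similar to the query, best first."""
    q = text_emb / np.linalg.norm(text_emb)
    g = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    scores = g @ q  # cosine similarity between each image and the query
    return np.argsort(-scores)[:top_k]


# Placeholder vectors; in practice these would come from the text encoder
# and from the SAR or MSI encoders.
query = np.array([1.0, 0.0, 0.0])
gallery = np.array([
    [0.0, 1.0, 0.0],   # unrelated to the query
    [0.9, 0.1, 0.0],   # close to the query
    [0.5, 0.5, 0.0],   # partially related
    [-1.0, 0.0, 0.0],  # opposite direction
])
top = rank_images(query, gallery, top_k=2)
print(top)
```

The gallery can mix SAR and MSI embeddings, since both are meant to live in the same space as the text embedding.
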
## Citation

```bibtex
@misc{cambrin2025texttoremotesensingimageretrievalrgbsources,
      title={Text-to-Remote-Sensing-Image Retrieval beyond RGB Sources},
      author={Daniele Rege Cambrin and Lorenzo Vaiani and Giuseppe Gallipoli and Luca Cagliero and Paolo Garza},
      year={2025},
      eprint={2507.10403},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.10403},
}
```

## Licensing
The data in this dataset is a compilation of multiple sources, each with its own license. For detailed information on the licensing of each component, please see the [**NOTICE.md**](NOTICE.md) file.