---
license: apache-2.0
pipeline_tag: zero-shot-image-classification
---

# Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction

This repository contains the official implementation of **SADCA** (Semantic-Augmented Dynamic Contrastive Attack), presented in the paper [Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction](https://arxiv.org/abs/2603.04839).

SADCA is a framework designed to enhance the transferability of adversarial attacks against vision-language pre-training (VLP) models. It progressively disrupts cross-modal alignment through dynamic interactions between adversarial images and texts, using a contrastive learning mechanism involving adversarial, positive, and negative samples to reinforce semantic inconsistency.
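
As a rough illustration of the contrastive idea described above, the following Python sketch computes a generic InfoNCE-style loss over one adversarial embedding, one positive, and several negatives. It is not the SADCA objective from the paper: the function names, the temperature value, and the toy embeddings are all assumptions made for illustration, and the actual loss, encoders, and update rule should be taken from the paper and the official code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(adv, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper, NOT the SADCA loss).

    The value is low when `adv` is close to `positive` and high when
    `adv` is closer to one of the `negatives`.
    """
    pos_logit = cosine(adv, positive) / temperature
    logits = [pos_logit] + [cosine(adv, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - pos_logit


# Hypothetical toy embeddings; real ones would come from a VLP encoder.
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

aligned = contrastive_loss([1.0, 0.0], positive, negatives)
misaligned = contrastive_loss([0.0, 1.0], positive, negatives)
print(aligned < misaligned)
```

In this sketch, an embedding that matches the positive gives a small loss, while one that matches a negative gives a large loss. An attack built on such a loss would, under this assumption, search for perturbations that increase it, which is one way to read "reinforcing semantic inconsistency".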

## Links

- **Paper**: [https://arxiv.org/abs/2603.04839](https://arxiv.org/abs/2603.04839)
- **GitHub**: [https://github.com/LiYuanBoJNU/SADCA](https://github.com/LiYuanBoJNU/SADCA)

## Citation

```bibtex
@article{li2026towards,
  title={Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction},
  author={Li, Yuanbo and Xu, Tianyang and Hu, Cong and Zhou, Tao and Wu, Xiao-Jun and Kittler, Josef},
  journal={arXiv preprint arXiv:2603.04839},
  year={2026}
}
```