Papers
arxiv:2507.07871

Mitigating Watermark Forgery in Generative Models via Randomized Key Selection

Published on May 11
Authors:
,
,
,
,
,
,
,

Abstract

A provably forgery-resistant watermarking defense is proposed that randomly selects watermark keys for each query and accepts content as genuine only if exactly one key detects a watermark, effectively reducing attack success rates while maintaining model utility.

AI-generated summary

Watermarking enables GenAI providers to verify whether content was generated by their models. A watermark is a hidden signal in the content, whose presence can be detected using a secret watermark key. A core security threat are forgery attacks, where adversaries insert the provider's watermark into content not produced by the provider, potentially damaging their reputation and undermining trust. Existing defenses resist forgery by embedding many watermarks with multiple keys into the same content, which can degrade model utility. However, forgery remains a threat when attackers can collect sufficiently many watermarked samples. We propose a defense that is provably forgery-resistant independent of the number of watermarked content collected by the attacker, provided they cannot easily distinguish watermarks from different keys. Our scheme does not further degrade model utility. We randomize the watermark key selection for each query and accept content as genuine only if a watermark is detected by exactly one key. We focus on the image and text modalities, but our defense is modality-agnostic, since it treats the underlying watermarking method as a black-box. Our method provably bounds the attacker's success rate and we empirically observe a reduction from near-perfect success rates to only 2% at negligible computational overhead.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2507.07871
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2507.07871 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2507.07871 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2507.07871 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.