arXiv:2407.12331

I2AM: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps

Published on Jul 17, 2024
Authors:

AI-generated summary

Image-to-Image Attribution Maps (I2AM) enhance the interpretability of image-to-image diffusion models by visualizing bidirectional feature transfer through cross-attention scores aggregated across time steps, attention heads, and layers, with demonstrated effectiveness in object detection, inpainting, and super-resolution tasks.

Abstract

Large-scale diffusion models have made significant advances in image generation, particularly through cross-attention mechanisms. While cross-attention has been well studied in text-to-image tasks, its interpretability in image-to-image (I2I) diffusion models remains underexplored. This paper introduces Image-to-Image Attribution Maps (I2AM), a method that enhances the interpretability of I2I models by visualizing bidirectional attribution maps: from the reference image to the generated image and vice versa. I2AM aggregates cross-attention scores across time steps, attention heads, and layers, offering insights into how critical features are transferred between images. We demonstrate the effectiveness of I2AM across object detection, inpainting, and super-resolution tasks. Our results show that I2AM successfully identifies key regions responsible for generating the output, even in complex scenes. Additionally, we introduce the Inpainting Mask Attention Consistency Score (IMACS), a novel evaluation metric that assesses the alignment between attribution maps and inpainting masks and correlates strongly with existing performance metrics. Through extensive experiments, we show that I2AM enables model debugging and refinement, providing practical tools for improving the performance and interpretability of I2I models.
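The abstract gives no implementation details, but the aggregation it describes can be illustrated with a short sketch. The PyTorch code below assumes cross-attention tensors of shape (heads, N_gen, N_ref) collected once per time step and layer on a shared token grid; the function names, the mean/max pooling choices for the two attribution directions, and the imacs_proxy formula are illustrative assumptions, not the authors' actual method.

```python
import torch

def aggregate_cross_attention(attn_maps: list[torch.Tensor]) -> torch.Tensor:
    """Average cross-attention over time steps, layers, and heads.

    attn_maps: one tensor per (time step, layer) pair, each shaped
    (heads, N_gen, N_ref), where N_gen and N_ref count the spatial tokens
    of the generated and reference images. Assumes a shared token grid
    across layers; a real model would need per-layer upsampling first.
    """
    per_map = [m.mean(dim=0) for m in attn_maps]   # average over heads
    return torch.stack(per_map).mean(dim=0)        # average steps and layers

def bidirectional_attribution(A: torch.Tensor, gen_hw, ref_hw):
    """Turn the aggregated (N_gen, N_ref) matrix into two heatmaps."""
    h_g, w_g = gen_hw
    h_r, w_r = ref_hw
    # Reference-side map: average attention each reference token receives
    # from the generated image (which reference regions were drawn on).
    ref_map = A.mean(dim=0).reshape(h_r, w_r)
    # Generated-side map: peak attention each generated token places on
    # any reference token (which generated regions copy most strongly).
    gen_map = A.max(dim=1).values.reshape(h_g, w_g)
    return gen_map, ref_map

def imacs_proxy(attr_map: torch.Tensor, mask: torch.Tensor) -> float:
    """Hypothetical stand-in for IMACS: mean normalized attribution inside
    the inpainting mask minus outside it. The paper's exact formula is not
    given in the abstract."""
    a = (attr_map - attr_map.min()) / (attr_map.max() - attr_map.min() + 1e-8)
    m = mask.bool()
    return (a[m].mean() - a[~m].mean()).item()

# Example with random tensors (2 steps x 2 layers, 8 heads, 16x16 grids):
maps = [torch.rand(8, 256, 256).softmax(dim=-1) for _ in range(4)]
A = aggregate_cross_attention(maps)
gen_map, ref_map = bidirectional_attribution(A, (16, 16), (16, 16))
mask = torch.zeros(16, 16)
mask[4:12, 4:12] = 1
score = imacs_proxy(ref_map, mask)
```

In practice, attention maps from a real model would be upsampled to a common resolution before averaging and overlaid on the reference and generated images for inspection.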
