VIRAL: Visual In-Context Reasoning via Analogy in Diffusion Transformers
Abstract
Replicating In-Context Learning (ICL) in computer vision remains challenging due to task heterogeneity. We propose VIRAL, a framework that elicits visual reasoning from a pre-trained image editing model by formulating visual ICL (V-ICL) as conditional generation via the visual analogy x_s : x_t :: x_q : y_q. We adapt a frozen Diffusion Transformer (DiT) with role-aware multi-image conditioning and introduce a Mixture-of-Experts LoRA (MoE-LoRA) to mitigate gradient interference across diverse tasks. To close the gaps in existing visual in-context datasets, we further curate a large-scale dataset spanning perception, restoration, and editing. Experiments show that VIRAL outperforms existing methods, validating that a unified V-ICL paradigm can handle the majority of visual tasks, including open-domain editing. Our code is available at https://anonymous.4open.science/r/VIRAL-744A
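To make the analogy formulation concrete, the following is a minimal sketch of role-aware multi-image conditioning, assuming the common pattern of adding learned role embeddings to each conditioning image's token sequence before it enters the frozen DiT. The class and function names (RoleAwareConditioner, build_context) and the three-role scheme are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: tag each image's tokens with its role in the analogy
# x_s : x_t :: x_q : y_q, then concatenate them as conditioning context.
import torch
import torch.nn as nn

class RoleAwareConditioner(nn.Module):
    """Adds a learned role embedding to each conditioning image's tokens."""
    ROLES = {"source": 0, "target": 1, "query": 2}  # assumed role set

    def __init__(self, dim: int):
        super().__init__()
        self.role_emb = nn.Embedding(len(self.ROLES), dim)

    def forward(self, tokens: torch.Tensor, role: str) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim) image tokens from a VAE/patchifier
        idx = torch.full(tokens.shape[:-1], self.ROLES[role],
                         dtype=torch.long, device=tokens.device)
        return tokens + self.role_emb(idx)

def build_context(cond: RoleAwareConditioner,
                  x_s: torch.Tensor, x_t: torch.Tensor,
                  x_q: torch.Tensor) -> torch.Tensor:
    # x_s -> x_t demonstrates the task; x_q is the new input whose output
    # y_q the frozen DiT is asked to generate, conditioned on this context.
    return torch.cat([cond(x_s, "source"),
                      cond(x_t, "target"),
                      cond(x_q, "query")], dim=1)
```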
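The MoE-LoRA component can likewise be sketched under standard assumptions: a frozen base linear layer is augmented with several low-rank expert adapters, and a learned router mixes a sparse subset of them per token, so tasks with conflicting gradients can route to different experts. The expert count, rank, and top-k routing below are placeholder hyperparameters, not values from the paper.

```python
# Hypothetical MoE-LoRA layer, assuming the usual mixture-of-LoRA-experts
# pattern: output = frozen W x + scale * sum_e gate_e * (B_e A_e x).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, num_experts: int = 4,
                 rank: int = 8, alpha: float = 16.0, top_k: int = 2):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # keep pre-trained weights frozen
            p.requires_grad_(False)
        in_f, out_f = base.in_features, base.out_features
        # One low-rank (A, B) pair per expert; B starts at zero so every
        # expert initially leaves the frozen model unchanged.
        self.A = nn.Parameter(torch.randn(num_experts, rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, out_f, rank))
        self.router = nn.Linear(in_f, num_experts)  # token-wise gating
        self.scale = alpha / rank
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.base(x)                              # frozen path
        logits = self.router(x)                       # (..., num_experts)
        topv, topi = logits.topk(self.top_k, dim=-1)
        gates = torch.zeros_like(logits).scatter(-1, topi,
                                                 F.softmax(topv, -1))
        # Low-rank updates for all experts, mixed with the sparse gates.
        u = torch.einsum('...d,erd->...er', x, self.A)       # (..., E, r)
        delta = torch.einsum('...er,eor->...eo', u, self.B)  # (..., E, out)
        return y + self.scale * torch.einsum('...e,...eo->...o',
                                             gates, delta)
```

In this sketch only the expert matrices and the router receive gradients, which matches the abstract's description of adapting a frozen DiT while mitigating cross-task gradient interference.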