	# T2I-Adapter

	[T2I-Adapter](https://huggingface.co/papers/2302.08453) is an adapter that enables controllable generation like [ControlNet](./controlnet). A T2I-Adapter works by learning a mapping between a control signal (for example, a depth map) and a pretrained model's internal knowledge. The adapter is plugged into the base model to provide extra guidance based on the control signal during generation.

	Load a T2I-Adapter conditioned on a specific control, such as Canny edges, and pass it to the pipeline in [from_pretrained()](/docs/diffusers/pr_12411/en/api/pipelines/overview#diffusers.DiffusionPipeline.from_pretrained).

	```py
	import torch
	from diffusers import T2IAdapter, StableDiffusionXLAdapterPipeline, AutoencoderKL

	t2i_adapter = T2IAdapter.from_pretrained(
	    "TencentARC/t2i-adapter-canny-sdxl-1.0",
	    torch_dtype=torch.float16,
	)
	```

	Generate a canny image with [opencv-python](https://github.com/opencv/opencv-python).

	```py
	import cv2
	import numpy as np
	from PIL import Image
	from diffusers.utils import load_image

	original_image = load_image(
	    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/non-enhanced-prompt.png"
	)

	image = np.array(original_image)

	low_threshold = 100
	high_threshold = 200

	image = cv2.Canny(image, low_threshold, high_threshold)
	image = image[:, :, None]
	image = np.concatenate([image, image, image], axis=2)
	canny_image = Image.fromarray(image)
	```
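
The last lines of the block above turn the single-channel edge map returned by `cv2.Canny` into a three-channel image before it is wrapped in a PIL image. A minimal NumPy-only sketch of that reshaping is shown here; the `edges` array is a made-up stand-in for real Canny output, used only to make the shapes visible.

```py
import numpy as np

# Hypothetical stand-in for cv2.Canny output: a 2D uint8 array of 0s and 255s.
edges = np.zeros((4, 6), dtype=np.uint8)
edges[1, 2:5] = 255

# Add a trailing channel axis: (H, W) -> (H, W, 1).
edges_hw1 = edges[:, :, None]

# Repeat the single channel three times: (H, W, 1) -> (H, W, 3).
edges_rgb = np.concatenate([edges_hw1, edges_hw1, edges_hw1], axis=2)

print(edges.shape, edges_rgb.shape)
```

All three channels hold the same edge values, so the result is a grayscale edge map stored in RGB layout.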

	Pass the canny image to the pipeline to generate an image.

	```py
	vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
	pipeline = StableDiffusionXLAdapterPipeline.from_pretrained(
	    "stabilityai/stable-diffusion-xl-base-1.0",
	    adapter=t2i_adapter,
	    vae=vae,
	    torch_dtype=torch.float16,
	).to("cuda")

	prompt = """
	A photorealistic overhead image of a cat reclining sideways in a flamingo pool floatie holding a margarita.
	The cat is floating leisurely in the pool and completely relaxed and happy.
	"""

	pipeline(
	    prompt,
	    image=canny_image,
	    num_inference_steps=100,
	    guidance_scale=10,
	).images[0]
	```



	Figures: original image, canny image, generated image.


	## MultiAdapter

	You can compose multiple controls, such as a canny image and a depth map, with the `MultiAdapter` class.

	Load the control images and the T2I-Adapters as lists in matching order (here, canny first and depth second).

	```py
	import torch
	from diffusers.utils import load_image
	from diffusers import StableDiffusionXLAdapterPipeline, AutoencoderKL, MultiAdapter, T2IAdapter

	canny_image = load_image(
	    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/canny-cat.png"
	)
	depth_image = load_image(
	    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl_depth_image.png"
	)
	controls = [canny_image, depth_image]
	prompt = ["""
	a relaxed rabbit sitting on a striped towel next to a pool with a tropical drink nearby,
	bright sunny day, vacation scene, 35mm photograph, film, professional, 4k, highly detailed
	"""]

	adapters = MultiAdapter(
	    [
	        T2IAdapter.from_pretrained("TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16),
	        T2IAdapter.from_pretrained("TencentARC/t2i-adapter-depth-midas-sdxl-1.0", torch_dtype=torch.float16),
	    ]
	)
	```

	Pass the adapters, prompt, and control images to [StableDiffusionXLAdapterPipeline](/docs/diffusers/pr_12411/en/api/pipelines/stable_diffusion/adapter#diffusers.StableDiffusionXLAdapterPipeline). Use the `adapter_conditioning_scale` parameter to determine how much weight to assign to each control.

	```py
	vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
	pipeline = StableDiffusionXLAdapterPipeline.from_pretrained(
	    "stabilityai/stable-diffusion-xl-base-1.0",
	    torch_dtype=torch.float16,
	    vae=vae,
	    adapter=adapters,
	).to("cuda")

	pipeline(
	    prompt,
	    image=controls,
	    height=1024,
	    width=1024,
	    adapter_conditioning_scale=[0.7, 0.7],
	).images[0]
	```
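
To build intuition for `adapter_conditioning_scale`, the sketch below treats each adapter's output as a short list of numbers and merges them with one weight per adapter. It is a simplified illustration that assumes the adapter outputs are combined as a weighted sum; `combine_adapter_features` and the sample values are hypothetical and not part of the Diffusers API.

```py
def combine_adapter_features(features, scales):
    """Weighted elementwise sum of per-adapter feature lists (toy illustration)."""
    if len(features) != len(scales):
        raise ValueError("Expected one scale per adapter.")
    combined = [0.0] * len(features[0])
    for feature, scale in zip(features, scales):
        for i, value in enumerate(feature):
            combined[i] += scale * value
    return combined


# Made-up feature values standing in for the canny and depth adapter outputs.
canny_features = [1.0, 0.0, 2.0]
depth_features = [0.0, 4.0, 2.0]

combined = combine_adapter_features([canny_features, depth_features], [0.5, 0.25])
print(combined)
```

Under this assumption, raising one entry of the scale list strengthens that control's influence relative to the other, which is why the list needs one value per adapter in the same order as the control images.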



	Figures: canny image, depth map, generated image.
