Buckets:
| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Pipeline callbacks","local":"pipeline-callbacks","sections":[{"title":"Official callbacks","local":"official-callbacks","sections":[],"depth":2},{"title":"Dynamic classifier-free guidance","local":"dynamic-classifier-free-guidance","sections":[],"depth":2},{"title":"Interrupt the diffusion process","local":"interrupt-the-diffusion-process","sections":[],"depth":2},{"title":"IP Adapter Cutoff","local":"ip-adapter-cutoff","sections":[],"depth":2},{"title":"Display image after each generation step","local":"display-image-after-each-generation-step","sections":[],"depth":2}],"depth":1}"> | |
| <link href="/docs/diffusers/pr_11986/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/entry/start.e4c568fa.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/scheduler.8c3d61f6.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/singletons.59c211be.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/index.0997d446.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/paths.38f9cd8e.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/entry/app.29bdf47f.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/index.da70eac4.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/nodes/0.f8d125de.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/nodes/280.5f13f990.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/Tip.1d9b8c37.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/CodeBlock.a9c4becf.js"> | |
| <link rel="modulepreload" href="/docs/diffusers/pr_11986/en/_app/immutable/chunks/getInferenceSnippets.366c2c95.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Pipeline callbacks","local":"pipeline-callbacks","sections":[{"title":"Official callbacks","local":"official-callbacks","sections":[],"depth":2},{"title":"Dynamic classifier-free guidance","local":"dynamic-classifier-free-guidance","sections":[],"depth":2},{"title":"Interrupt the diffusion process","local":"interrupt-the-diffusion-process","sections":[],"depth":2},{"title":"IP Adapter Cutoff","local":"ip-adapter-cutoff","sections":[],"depth":2},{"title":"Display image after each generation step","local":"display-image-after-each-generation-step","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="pipeline-callbacks" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#pipeline-callbacks"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Pipeline callbacks</span></h1> <p data-svelte-h="svelte-1ex9ief">The denoising loop of a pipeline can be modified with custom defined functions using the <code>callback_on_step_end</code> parameter. The callback function is executed at the end of each step, and modifies the pipeline attributes and variables for the next step. This is really useful for <em>dynamically</em> adjusting certain pipeline attributes or modifying tensor variables. This versatility allows for interesting use cases such as changing the prompt embeddings at each timestep, assigning different weights to the prompt embeddings, and editing the guidance scale. With callbacks, you can implement new features without modifying the underlying code!</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1g9pkv">🤗 Diffusers currently only supports <code>callback_on_step_end</code>, but feel free to open a <a href="https://github.com/huggingface/diffusers/issues/new/choose" rel="nofollow">feature request</a> if you have a cool use-case and require a callback function with a different execution point!</p></div> <p data-svelte-h="svelte-1ct42i6">This guide will demonstrate how callbacks work by a few features you can implement with them.</p> <h2 class="relative group"><a id="official-callbacks" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#official-callbacks"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Official callbacks</span></h2> <p data-svelte-h="svelte-1b451ym">We provide a list of callbacks you can plug into an existing pipeline and modify the denoising loop. This is the current list of official callbacks:</p> <ul data-svelte-h="svelte-vd4zqd"><li><code>SDCFGCutoffCallback</code>: Disables the CFG after a certain number of steps for all SD 1.5 pipelines, including text-to-image, image-to-image, inpaint, and controlnet.</li> <li><code>SDXLCFGCutoffCallback</code>: Disables the CFG after a certain number of steps for all SDXL pipelines, including text-to-image, image-to-image, inpaint, and controlnet.</li> <li><code>IPAdapterScaleCutoffCallback</code>: Disables the IP Adapter after a certain number of steps for all pipelines supporting IP-Adapter.</li></ul> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1q7jmra">If you want to add a new official callback, feel free to open a <a href="https://github.com/huggingface/diffusers/issues/new/choose" rel="nofollow">feature request</a> or <a href="https://huggingface.co/docs/diffusers/main/en/conceptual/contribution#how-to-open-a-pr" rel="nofollow">submit a PR</a>.</p></div> <p data-svelte-h="svelte-1y502l4">To set up a callback, you need to specify the number of denoising steps after which the callback comes into effect. You can do so by using either one of these two arguments</p> <ul data-svelte-h="svelte-3im7a1"><li><code>cutoff_step_ratio</code>: Float number with the ratio of the steps.</li> <li><code>cutoff_step_index</code>: Integer number with the exact number of the step.</li></ul> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch | |
| <span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> DPMSolverMultistepScheduler, StableDiffusionXLPipeline | |
| <span class="hljs-keyword">from</span> diffusers.callbacks <span class="hljs-keyword">import</span> SDXLCFGCutoffCallback | |
| callback = SDXLCFGCutoffCallback(cutoff_step_ratio=<span class="hljs-number">0.4</span>) | |
| <span class="hljs-comment"># can also be used with cutoff_step_index</span> | |
| <span class="hljs-comment"># callback = SDXLCFGCutoffCallback(cutoff_step_ratio=None, cutoff_step_index=10)</span> | |
| pipeline = StableDiffusionXLPipeline.from_pretrained( | |
| <span class="hljs-string">"stabilityai/stable-diffusion-xl-base-1.0"</span>, | |
| torch_dtype=torch.float16, | |
| variant=<span class="hljs-string">"fp16"</span>, | |
| ).to(<span class="hljs-string">"cuda"</span>) | |
| pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, use_karras_sigmas=<span class="hljs-literal">True</span>) | |
| prompt = <span class="hljs-string">"a sports car at the road, best quality, high quality, high detail, 8k resolution"</span> | |
| generator = torch.Generator(device=<span class="hljs-string">"cpu"</span>).manual_seed(<span class="hljs-number">2628670641</span>) | |
| out = pipeline( | |
| prompt=prompt, | |
| negative_prompt=<span class="hljs-string">""</span>, | |
| guidance_scale=<span class="hljs-number">6.5</span>, | |
| num_inference_steps=<span class="hljs-number">25</span>, | |
| generator=generator, | |
| callback_on_step_end=callback, | |
| ) | |
| out.images[<span class="hljs-number">0</span>].save(<span class="hljs-string">"official_callback.png"</span>)<!-- HTML_TAG_END --></pre></div> <div class="flex gap-4" data-svelte-h="svelte-17tud58"><div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/without_cfg_callback.png" alt="generated image of a sports car at the road"> <figcaption class="mt-2 text-center text-sm text-gray-500">without SDXLCFGCutoffCallback</figcaption></div> <div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/with_cfg_callback.png" alt="generated image of a sports car at the road with cfg callback"> <figcaption class="mt-2 text-center text-sm text-gray-500">with SDXLCFGCutoffCallback</figcaption></div></div> <h2 class="relative group"><a id="dynamic-classifier-free-guidance" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#dynamic-classifier-free-guidance"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Dynamic classifier-free guidance</span></h2> <p data-svelte-h="svelte-1f6wgfp">Dynamic classifier-free guidance (CFG) is a feature that allows you to disable CFG after a certain number of inference steps which can help you save compute with minimal cost to performance. The callback function for this should have the following arguments:</p> <ul data-svelte-h="svelte-dtd0ri"><li><code>pipeline</code> (or the pipeline instance) provides access to important properties such as <code>num_timesteps</code> and <code>guidance_scale</code>. You can modify these properties by updating the underlying attributes. For this example, you’ll disable CFG by setting <code>pipeline._guidance_scale=0.0</code>.</li> <li><code>step_index</code> and <code>timestep</code> tell you where you are in the denoising loop. Use <code>step_index</code> to turn off CFG after reaching 40% of <code>num_timesteps</code>.</li> <li><code>callback_kwargs</code> is a dict that contains tensor variables you can modify during the denoising loop. It only includes variables specified in the <code>callback_on_step_end_tensor_inputs</code> argument, which is passed to the pipeline’s <code>__call__</code> method. Different pipelines may use different sets of variables, so please check a pipeline’s <code>_callback_tensor_inputs</code> attribute for the list of variables you can modify. Some common variables include <code>latents</code> and <code>prompt_embeds</code>. For this function, change the batch size of <code>prompt_embeds</code> after setting <code>guidance_scale=0.0</code> in order for it to work properly.</li></ul> <p data-svelte-h="svelte-1s2b9st">Your callback function should look something like this:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">def</span> <span class="hljs-title function_">callback_dynamic_cfg</span>(<span class="hljs-params">pipe, step_index, timestep, callback_kwargs</span>): | |
| <span class="hljs-comment"># adjust the batch_size of prompt_embeds according to guidance_scale</span> | |
| <span class="hljs-keyword">if</span> step_index == <span class="hljs-built_in">int</span>(pipeline.num_timesteps * <span class="hljs-number">0.4</span>): | |
| prompt_embeds = callback_kwargs[<span class="hljs-string">"prompt_embeds"</span>] | |
| prompt_embeds = prompt_embeds.chunk(<span class="hljs-number">2</span>)[-<span class="hljs-number">1</span>] | |
| <span class="hljs-comment"># update guidance_scale and prompt_embeds</span> | |
| pipeline._guidance_scale = <span class="hljs-number">0.0</span> | |
| callback_kwargs[<span class="hljs-string">"prompt_embeds"</span>] = prompt_embeds | |
| <span class="hljs-keyword">return</span> callback_kwargs<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-y3yxaw">Now, you can pass the callback function to the <code>callback_on_step_end</code> parameter and the <code>prompt_embeds</code> to <code>callback_on_step_end_tensor_inputs</code>.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch | |
| <span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> StableDiffusionPipeline | |
| pipeline = StableDiffusionPipeline.from_pretrained(<span class="hljs-string">"stable-diffusion-v1-5/stable-diffusion-v1-5"</span>, torch_dtype=torch.float16) | |
| pipeline = pipeline.to(<span class="hljs-string">"cuda"</span>) | |
| prompt = <span class="hljs-string">"a photo of an astronaut riding a horse on mars"</span> | |
| generator = torch.Generator(device=<span class="hljs-string">"cuda"</span>).manual_seed(<span class="hljs-number">1</span>) | |
| out = pipeline( | |
| prompt, | |
| generator=generator, | |
| callback_on_step_end=callback_dynamic_cfg, | |
| callback_on_step_end_tensor_inputs=[<span class="hljs-string">'prompt_embeds'</span>] | |
| ) | |
| out.images[<span class="hljs-number">0</span>].save(<span class="hljs-string">"out_custom_cfg.png"</span>)<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="interrupt-the-diffusion-process" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#interrupt-the-diffusion-process"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Interrupt the diffusion process</span></h2> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-q9hytq">The interruption callback is supported for text-to-image, image-to-image, and inpainting for the <a href="../api/pipelines/stable_diffusion/overview">StableDiffusionPipeline</a> and <a href="../api/pipelines/stable_diffusion/stable_diffusion_xl">StableDiffusionXLPipeline</a>.</p></div> <p data-svelte-h="svelte-1nej5ag">Stopping the diffusion process early is useful when building UIs that work with Diffusers because it allows users to stop the generation process if they’re unhappy with the intermediate results. You can incorporate this into your pipeline with a callback.</p> <p data-svelte-h="svelte-zdg2uc">This callback function should take the following arguments: <code>pipeline</code>, <code>i</code>, <code>t</code>, and <code>callback_kwargs</code> (this must be returned). Set the pipeline’s <code>_interrupt</code> attribute to <code>True</code> to stop the diffusion process after a certain number of steps. You are also free to implement your own custom stopping logic inside the callback.</p> <p data-svelte-h="svelte-s21qgj">In this example, the diffusion process is stopped after 10 steps even though <code>num_inference_steps</code> is set to 50.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> StableDiffusionPipeline | |
| pipeline = StableDiffusionPipeline.from_pretrained(<span class="hljs-string">"stable-diffusion-v1-5/stable-diffusion-v1-5"</span>) | |
| pipeline.enable_model_cpu_offload() | |
| num_inference_steps = <span class="hljs-number">50</span> | |
| <span class="hljs-keyword">def</span> <span class="hljs-title function_">interrupt_callback</span>(<span class="hljs-params">pipeline, i, t, callback_kwargs</span>): | |
| stop_idx = <span class="hljs-number">10</span> | |
| <span class="hljs-keyword">if</span> i == stop_idx: | |
| pipeline._interrupt = <span class="hljs-literal">True</span> | |
| <span class="hljs-keyword">return</span> callback_kwargs | |
| pipeline( | |
| <span class="hljs-string">"A photo of a cat"</span>, | |
| num_inference_steps=num_inference_steps, | |
| callback_on_step_end=interrupt_callback, | |
| )<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="ip-adapter-cutoff" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#ip-adapter-cutoff"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>IP Adapter Cutoff</span></h2> <p data-svelte-h="svelte-1xau8un">IP Adapter is an image prompt adapter that can be used for diffusion models without any changes to the underlying model. We can use the IP Adapter Cutoff Callback to disable the IP Adapter after a certain number of steps. To set up the callback, you need to specify the number of denoising steps after which the callback comes into effect. You can do so by using either one of these two arguments:</p> <ul data-svelte-h="svelte-3im7a1"><li><code>cutoff_step_ratio</code>: Float number with the ratio of the steps.</li> <li><code>cutoff_step_index</code>: Integer number with the exact number of the step.</li></ul> <p data-svelte-h="svelte-1f2qpl3">We need to download the diffusion model and load the ip_adapter for it as follows:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> AutoPipelineForText2Image | |
| <span class="hljs-keyword">from</span> diffusers.utils <span class="hljs-keyword">import</span> load_image | |
| <span class="hljs-keyword">import</span> torch | |
| pipeline = AutoPipelineForText2Image.from_pretrained(<span class="hljs-string">"stabilityai/stable-diffusion-xl-base-1.0"</span>, torch_dtype=torch.float16).to(<span class="hljs-string">"cuda"</span>) | |
| pipeline.load_ip_adapter(<span class="hljs-string">"h94/IP-Adapter"</span>, subfolder=<span class="hljs-string">"sdxl_models"</span>, weight_name=<span class="hljs-string">"ip-adapter_sdxl.bin"</span>) | |
| pipeline.set_ip_adapter_scale(<span class="hljs-number">0.6</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-bjmgr4">The setup for the callback should look something like this:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --> | |
| <span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> AutoPipelineForText2Image | |
| <span class="hljs-keyword">from</span> diffusers.callbacks <span class="hljs-keyword">import</span> IPAdapterScaleCutoffCallback | |
| <span class="hljs-keyword">from</span> diffusers.utils <span class="hljs-keyword">import</span> load_image | |
| <span class="hljs-keyword">import</span> torch | |
| pipeline = AutoPipelineForText2Image.from_pretrained( | |
| <span class="hljs-string">"stabilityai/stable-diffusion-xl-base-1.0"</span>, | |
| torch_dtype=torch.float16 | |
| ).to(<span class="hljs-string">"cuda"</span>) | |
| pipeline.load_ip_adapter( | |
| <span class="hljs-string">"h94/IP-Adapter"</span>, | |
| subfolder=<span class="hljs-string">"sdxl_models"</span>, | |
| weight_name=<span class="hljs-string">"ip-adapter_sdxl.bin"</span> | |
| ) | |
| pipeline.set_ip_adapter_scale(<span class="hljs-number">0.6</span>) | |
| callback = IPAdapterScaleCutoffCallback( | |
| cutoff_step_ratio=<span class="hljs-literal">None</span>, | |
| cutoff_step_index=<span class="hljs-number">5</span> | |
| ) | |
| image = load_image( | |
| <span class="hljs-string">"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/ip_adapter_diner.png"</span> | |
| ) | |
| generator = torch.Generator(device=<span class="hljs-string">"cuda"</span>).manual_seed(<span class="hljs-number">2628670641</span>) | |
| images = pipeline( | |
| prompt=<span class="hljs-string">"a tiger sitting in a chair drinking orange juice"</span>, | |
| ip_adapter_image=image, | |
| negative_prompt=<span class="hljs-string">"deformed, ugly, wrong proportion, low res, bad anatomy, worst quality, low quality"</span>, | |
| generator=generator, | |
| num_inference_steps=<span class="hljs-number">50</span>, | |
| callback_on_step_end=callback, | |
| ).images | |
| images[<span class="hljs-number">0</span>].save(<span class="hljs-string">"custom_callback_img.png"</span>)<!-- HTML_TAG_END --></pre></div> <div class="flex gap-4" data-svelte-h="svelte-1qb4eti"><div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/without_callback.png" alt="generated image of a tiger sitting in a chair drinking orange juice"> <figcaption class="mt-2 text-center text-sm text-gray-500">without IPAdapterScaleCutoffCallback</figcaption></div> <div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/with_callback2.png" alt="generated image of a tiger sitting in a chair drinking orange juice with ip adapter callback"> <figcaption class="mt-2 text-center text-sm text-gray-500">with IPAdapterScaleCutoffCallback</figcaption></div></div> <h2 class="relative group"><a id="display-image-after-each-generation-step" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#display-image-after-each-generation-step"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Display image after each generation step</span></h2> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-13sfhtz">This tip was contributed by <a href="https://github.com/asomoza" rel="nofollow">asomoza</a>.</p></div> <p data-svelte-h="svelte-1ofpl9u">Display an image after each generation step by accessing and converting the latents after each step into an image. The latent space is compressed to 128x128, so the images are also 128x128 which is useful for a quick preview.</p> <ol data-svelte-h="svelte-19fyqau"><li>Use the function below to convert the SDXL latents (4 channels) to RGB tensors (3 channels) as explained in the <a href="https://huggingface.co/blog/TimothyAlexisVass/explaining-the-sdxl-latent-space" rel="nofollow">Explaining the SDXL latent space</a> blog post.</li></ol> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">def</span> <span class="hljs-title function_">latents_to_rgb</span>(<span class="hljs-params">latents</span>): | |
| weights = ( | |
| (<span class="hljs-number">60</span>, -<span class="hljs-number">60</span>, <span class="hljs-number">25</span>, -<span class="hljs-number">70</span>), | |
| (<span class="hljs-number">60</span>, -<span class="hljs-number">5</span>, <span class="hljs-number">15</span>, -<span class="hljs-number">50</span>), | |
| (<span class="hljs-number">60</span>, <span class="hljs-number">10</span>, -<span class="hljs-number">5</span>, -<span class="hljs-number">35</span>), | |
| ) | |
| weights_tensor = torch.t(torch.tensor(weights, dtype=latents.dtype).to(latents.device)) | |
| biases_tensor = torch.tensor((<span class="hljs-number">150</span>, <span class="hljs-number">140</span>, <span class="hljs-number">130</span>), dtype=latents.dtype).to(latents.device) | |
| rgb_tensor = torch.einsum(<span class="hljs-string">"...lxy,lr -> ...rxy"</span>, latents, weights_tensor) + biases_tensor.unsqueeze(-<span class="hljs-number">1</span>).unsqueeze(-<span class="hljs-number">1</span>) | |
| image_array = rgb_tensor.clamp(<span class="hljs-number">0</span>, <span class="hljs-number">255</span>).byte().cpu().numpy().transpose(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>) | |
| <span class="hljs-keyword">return</span> Image.fromarray(image_array)<!-- HTML_TAG_END --></pre></div> <ol start="2" data-svelte-h="svelte-ebwxn0"><li>Create a function to decode and save the latents into an image.</li></ol> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">def</span> <span class="hljs-title function_">decode_tensors</span>(<span class="hljs-params">pipe, step, timestep, callback_kwargs</span>): | |
| latents = callback_kwargs[<span class="hljs-string">"latents"</span>] | |
| image = latents_to_rgb(latents[<span class="hljs-number">0</span>]) | |
| image.save(<span class="hljs-string">f"<span class="hljs-subst">{step}</span>.png"</span>) | |
| <span class="hljs-keyword">return</span> callback_kwargs<!-- HTML_TAG_END --></pre></div> <ol start="3" data-svelte-h="svelte-1r24q0u"><li>Pass the <code>decode_tensors</code> function to the <code>callback_on_step_end</code> parameter to decode the tensors after each step. You also need to specify what you want to modify in the <code>callback_on_step_end_tensor_inputs</code> parameter, which in this case are the latents.</li></ol> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> AutoPipelineForText2Image | |
| <span class="hljs-keyword">import</span> torch | |
| <span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> Image | |
| pipeline = AutoPipelineForText2Image.from_pretrained( | |
| <span class="hljs-string">"stabilityai/stable-diffusion-xl-base-1.0"</span>, | |
| torch_dtype=torch.float16, | |
| variant=<span class="hljs-string">"fp16"</span>, | |
| use_safetensors=<span class="hljs-literal">True</span> | |
| ).to(<span class="hljs-string">"cuda"</span>) | |
| image = pipeline( | |
| prompt=<span class="hljs-string">"A croissant shaped like a cute bear."</span>, | |
| negative_prompt=<span class="hljs-string">"Deformed, ugly, bad anatomy"</span>, | |
| callback_on_step_end=decode_tensors, | |
| callback_on_step_end_tensor_inputs=[<span class="hljs-string">"latents"</span>], | |
| ).images[<span class="hljs-number">0</span>]<!-- HTML_TAG_END --></pre></div> <div class="flex gap-4 justify-center" data-svelte-h="svelte-o2x0os"><div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/tips_step_0.png"> <figcaption class="mt-2 text-center text-sm text-gray-500">step 0</figcaption></div> <div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/tips_step_19.png"> <figcaption class="mt-2 text-center text-sm text-gray-500">step 19</figcaption></div> <div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/tips_step_29.png"> <figcaption class="mt-2 text-center text-sm text-gray-500">step 29</figcaption></div> <div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/tips_step_39.png"> <figcaption class="mt-2 text-center text-sm text-gray-500">step 39</figcaption></div> <div><img class="rounded-xl" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/tips_step_49.png"> <figcaption class="mt-2 text-center text-sm text-gray-500">step 49</figcaption></div></div> <a class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" href="https://github.com/huggingface/diffusers/blob/main/docs/source/en/using-diffusers/callback.md" target="_blank"><span data-svelte-h="svelte-1kd6by1"><</span> <span data-svelte-h="svelte-x0xyl0">></span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p> | |
| <script> | |
| { | |
| __sveltekit_u238jf = { | |
| assets: "/docs/diffusers/pr_11986/en", | |
| base: "/docs/diffusers/pr_11986/en", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/diffusers/pr_11986/en/_app/immutable/entry/start.e4c568fa.js"), | |
| import("/docs/diffusers/pr_11986/en/_app/immutable/entry/app.29bdf47f.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 280], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |
Xet Storage Details
- Size:
- 44.1 kB
- Xet hash:
- 3b02738c114e23883df6a4cf9964730e989a0006cc6b538f5b6bb33d73e702ad
·
Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.