Buckets:

hf-doc-build
/

doc-dev

hf-doc-build/doc-dev / diffusers /pr_11234 /en /tutorials /inference_with_big_models.html

25.1 kB

	<meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Working with big models","local":"working-with-big-models","sections":[{"title":"Device placement","local":"device-placement","sections":[],"depth":2}],"depth":1}">
	<link href="/docs/diffusers/pr_11234/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/entry/start.5a617280.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/scheduler.8c3d61f6.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/singletons.0c831ec7.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/index.0997d446.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/paths.c606bb8d.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/entry/app.1a1de8b6.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/index.da70eac4.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/nodes/0.8544c871.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/each.e59479a4.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/nodes/254.eec340bf.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/Tip.1d9b8c37.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/CodeBlock.a9c4becf.js">
	<link rel="modulepreload" href="/docs/diffusers/pr_11234/en/_app/immutable/chunks/index.5d4ab994.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Working with big models","local":"working-with-big-models","sections":[{"title":"Device placement","local":"device-placement","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="working-with-big-models" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#working-with-big-models"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Working with big models</span></h1> <p data-svelte-h="svelte-i5ki63">A modern diffusion model, like <a href="../using-diffusers/sdxl">Stable Diffusion XL (SDXL)</a>, is not just a single model, but a collection of multiple models. SDXL has four different model-level components:</p> <ul data-svelte-h="svelte-di8t2m"><li>A variational autoencoder (VAE)</li> <li>Two text encoders</li> <li>A UNet for denoising</li></ul> <p data-svelte-h="svelte-141shqw">Usually, the text encoders and the denoiser are much larger compared to the VAE.</p> <p data-svelte-h="svelte-n27w9x">As models get bigger and better, it’s possible your model is so big that even a single copy won’t fit in memory. But that doesn’t mean it can’t be loaded. If you have more than one GPU, there is more memory available to store your model. In this case, it’s better to split your model checkpoint into several smaller <em>checkpoint shards</em>.</p> <p data-svelte-h="svelte-1cxyqis">When a text encoder checkpoint has multiple shards, like <a href="https://huggingface.co/stabilityai/stable-diffusion-3-medium-diffusers/tree/main/text_encoder_3" rel="nofollow">T5-xxl for SD3</a>, it is automatically handled by the <a href="https://huggingface.co/docs/transformers/index" rel="nofollow">Transformers</a> library as it is a required dependency of Diffusers when using the <a href="/docs/diffusers/pr_11234/en/api/pipelines/stable_diffusion/stable_diffusion_3#diffusers.StableDiffusion3Pipeline">StableDiffusion3Pipeline</a>. More specifically, Transformers will automatically handle the loading of multiple shards within the requested model class and get it ready so that inference can be performed.</p> <p data-svelte-h="svelte-igij6y">The denoiser checkpoint can also have multiple shards and supports inference thanks to the <a href="https://huggingface.co/docs/accelerate/index" rel="nofollow">Accelerate</a> library.</p> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-6g79u0">Refer to the <a href="https://huggingface.co/docs/accelerate/main/en/concept_guides/big_model_inference" rel="nofollow">Handling big models for inference</a> guide for general guidance when working with big models that are hard to fit into memory.</p></div> <p data-svelte-h="svelte-bu4e6r">For example, let’s save a sharded checkpoint for the <a href="https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main/unet" rel="nofollow">SDXL UNet</a>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> UNet2DConditionModel

	unet = UNet2DConditionModel.from_pretrained(
	<span class="hljs-string">"stabilityai/stable-diffusion-xl-base-1.0"</span>, subfolder=<span class="hljs-string">"unet"</span>
	)
	unet.save_pretrained(<span class="hljs-string">"sdxl-unet-sharded"</span>, max_shard_size=<span class="hljs-string">"5GB"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-m4u5h9">The size of the fp32 variant of the SDXL UNet checkpoint is ~10.4GB. Set the <code>max_shard_size</code> parameter to 5GB to create 3 shards. After saving, you can load them in <a href="/docs/diffusers/pr_11234/en/api/pipelines/stable_diffusion/stable_diffusion_xl#diffusers.StableDiffusionXLPipeline">StableDiffusionXLPipeline</a>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> UNet2DConditionModel, StableDiffusionXLPipeline
	<span class="hljs-keyword">import</span> torch

	unet = UNet2DConditionModel.from_pretrained(
	<span class="hljs-string">"sayakpaul/sdxl-unet-sharded"</span>, torch_dtype=torch.float16
	)
	pipeline = StableDiffusionXLPipeline.from_pretrained(
	<span class="hljs-string">"stabilityai/stable-diffusion-xl-base-1.0"</span>, unet=unet, torch_dtype=torch.float16
	).to(<span class="hljs-string">"cuda"</span>)

	image = pipeline(<span class="hljs-string">"a cute dog running on the grass"</span>, num_inference_steps=<span class="hljs-number">30</span>).images[<span class="hljs-number">0</span>]
	image.save(<span class="hljs-string">"dog.png"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-17ggtvi">If placing all the model-level components on the GPU at once is not feasible, use <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_model_cpu_offload">enable_model_cpu_offload()</a> to help you:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-deletion">- pipeline.to("cuda")</span>
	<span class="hljs-addition">+ pipeline.enable_model_cpu_offload()</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-110kwsj">In general, we recommend sharding when a checkpoint is more than 5GB (in fp32).</p> <h2 class="relative group"><a id="device-placement" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#device-placement"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Device placement</span></h2> <p data-svelte-h="svelte-17ssrfq">On distributed setups, you can run inference across multiple GPUs with Accelerate.</p> <div class="course-tip course-tip-orange bg-gradient-to-br dark:bg-gradient-to-r before:border-orange-500 dark:before:border-orange-800 from-orange-50 dark:from-gray-900 to-white dark:to-gray-950 border border-orange-50 text-orange-700 dark:text-gray-400"><p data-svelte-h="svelte-jtjc2m">This feature is experimental and its APIs might change in the future.</p></div> <p data-svelte-h="svelte-1y4i1fz">With Accelerate, you can use the <code>device_map</code> to determine how to distribute the models of a pipeline across multiple devices. This is useful in situations where you have more than one GPU.</p> <p data-svelte-h="svelte-10bmn1m">For example, if you have two 8GB GPUs, then using <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_model_cpu_offload">enable_model_cpu_offload()</a> may not work so well because:</p> <ul data-svelte-h="svelte-eei53u"><li>it only works on a single GPU</li> <li>a single model might not fit on a single GPU (<a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_sequential_cpu_offload">enable_sequential_cpu_offload()</a> might work but it will be extremely slow and it is also limited to a single GPU)</li></ul> <p data-svelte-h="svelte-awezto">To make use of both GPUs, you can use the “balanced” device placement strategy which splits the models across all available GPUs.</p> <div class="course-tip course-tip-orange bg-gradient-to-br dark:bg-gradient-to-r before:border-orange-500 dark:before:border-orange-800 from-orange-50 dark:from-gray-900 to-white dark:to-gray-950 border border-orange-50 text-orange-700 dark:text-gray-400"><p data-svelte-h="svelte-qgg1n4">Only the “balanced” strategy is supported at the moment, and we plan to support additional mapping strategies in the future.</p></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->from diffusers import DiffusionPipeline
	import torch

	pipeline = DiffusionPipeline.from_pretrained(
	<span class="hljs-deletion">- "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True,</span>
	<span class="hljs-addition">+ "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True, device_map="balanced"</span>
	)
	image = pipeline("a dog").images[0]
	image<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1y8b6l8">You can also pass a dictionary to enforce the maximum GPU memory that can be used on each device:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->from diffusers import DiffusionPipeline
	import torch

	max_memory = {0:"1GB", 1:"1GB"}
	pipeline = DiffusionPipeline.from_pretrained(
	"stable-diffusion-v1-5/stable-diffusion-v1-5",
	torch_dtype=torch.float16,
	use_safetensors=True,
	device_map="balanced",
	<span class="hljs-addition">+ max_memory=max_memory</span>
	)
	image = pipeline("a dog").images[0]
	image<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-fmsbbs">If a device is not present in <code>max_memory</code>, then it will be completely ignored and will not participate in the device placement.</p> <p data-svelte-h="svelte-1f2rh7k">By default, Diffusers uses the maximum memory of all devices. If the models don’t fit on the GPUs, they are offloaded to the CPU. If the CPU doesn’t have enough memory, then you might see an error. In that case, you could defer to using <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_sequential_cpu_offload">enable_sequential_cpu_offload()</a> and <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_model_cpu_offload">enable_model_cpu_offload()</a>.</p> <p data-svelte-h="svelte-1zxb45">Call <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.reset_device_map">reset_device_map()</a> to reset the <code>device_map</code> of a pipeline. This is also necessary if you want to use methods like <code>to()</code>, <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_sequential_cpu_offload">enable_sequential_cpu_offload()</a>, and <a href="/docs/diffusers/pr_11234/en/api/pipelines/overview#diffusers.DiffusionPipeline.enable_model_cpu_offload">enable_model_cpu_offload()</a> on a pipeline that was device-mapped.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->pipeline.reset_device_map()<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-17pbny5">Once a pipeline has been device-mapped, you can also access its device map via <code>hf_device_map</code>:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-built_in">print</span>(pipeline.hf_device_map)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-18o6tiw">An example device map would look like so:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->{<span class="hljs-string">'unet'</span>: 1, <span class="hljs-string">'vae'</span>: 1, <span class="hljs-string">'safety_checker'</span>: 0, <span class="hljs-string">'text_encoder'</span>: 0}<!-- HTML_TAG_END --></pre></div> <a class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" href="https://github.com/huggingface/diffusers/blob/main/docs/source/en/tutorials/inference_with_big_models.md" target="_blank"><span data-svelte-h="svelte-1kd6by1"><</span> <span data-svelte-h="svelte-x0xyl0">></span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p>

	<script>
	{
	__sveltekit_225bzv = {
	assets: "/docs/diffusers/pr_11234/en",
	base: "/docs/diffusers/pr_11234/en",
	env: {}
	};

	const element = document.currentScript.parentElement;

	const data = [null,null];

	Promise.all([
	import("/docs/diffusers/pr_11234/en/_app/immutable/entry/start.5a617280.js"),
	import("/docs/diffusers/pr_11234/en/_app/immutable/entry/app.1a1de8b6.js")
	]).then(([kit, app]) => {
	kit.start(app, element, {
	node_ids: [0, 254],
	data,
	form: null,
	error: null
	});
	});
	}
	</script>

Xet Storage Details

Size:: 25.1 kB
Xet hash:: db2d1453e86bbd746557f5fc245879b7739c6e80332c549dea62d88941c1eb50

Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.