# Community pipelines

> For more information about community pipelines, please refer to [this issue](https://github.com/huggingface/diffusers/issues/841).

**Community** examples consist of inference and training examples that have been added by the community. Please refer to the following table for an overview of all community examples. Click on a **code example** to see a copy-and-paste-ready snippet. If a community pipeline doesn't work as expected, please open an issue and ping the author.

| Example | Description | Code example | Colab | Author |
|:--------|:------------|:-------------|:------|-------:|
| CLIP Guided Stable Diffusion | Text-to-image generation with Stable Diffusion guided by CLIP | [CLIP Guided Stable Diffusion](#clip-guided-stable-diffusion) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/CLIP_Guided_Stable_diffusion_with_diffusers.ipynb) | [Suraj Patil](https://github.com/patil-suraj/) |
| One Step U-Net (Dummy) | An example of how community pipelines should be used (see https://github.com/huggingface/diffusers/issues/841) | [One Step U-Net](#one-step-unet) | - | [Patrick von Platen](https://github.com/patrickvonplaten/) |
| Stable Diffusion Interpolation | Interpolate the latent space of Stable Diffusion between different prompts/seeds | [Stable Diffusion Interpolation](#stable-diffusion-interpolation) | - | [Nate Raw](https://github.com/nateraw/) |
| Stable Diffusion Mega | **One** Stable Diffusion pipeline with all the functionality of [Text2Image](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py), [Image2Image](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py) and [Inpainting](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py) | [Stable Diffusion Mega](#stable-diffusion-mega) | - | [Patrick von Platen](https://github.com/patrickvonplaten/) |
| Long Prompt Weighting Stable Diffusion | **One** Stable Diffusion pipeline without a token-length limit and with support for parsing weights in the prompt | [Long Prompt Weighting Stable Diffusion](#long-prompt-weighting-stable-diffusion) | - | [SkyTNT](https://github.com/SkyTNT) |
| Speech to Image | Transcribe speech with automatic speech recognition and generate an image with Stable Diffusion | [Speech to Image](#speech-to-image) | - | [Mikail Duzenli](https://github.com/MikailINTech) |

To load a custom pipeline, simply pass the `custom_pipeline` argument to `DiffusionPipeline`, set to one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipeline; we will merge it quickly.

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", custom_pipeline="filename_in_the_community_folder"
)
```

## Usage examples

### CLIP Guided Stable Diffusion

By guiding Stable Diffusion with an additional CLIP model at every denoising step, CLIP guided Stable Diffusion can generate more photorealistic images.

The following code requires roughly 12 GB of GPU RAM.

```python
from diffusers import DiffusionPipeline
from transformers import CLIPImageProcessor, CLIPModel
import torch

feature_extractor = CLIPImageProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K", torch_dtype=torch.float16)

guided_pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
    torch_dtype=torch.float16,
)
guided_pipeline.enable_attention_slicing()
guided_pipeline = guided_pipeline.to("cuda")

prompt = "fantasy book cover, full moon, fantasy forest landscape, golden vector elements, fantasy magic, dark light night, intricate, elegant, sharp focus, illustration, highly detailed, digital painting, concept art, matte, art by WLOP and Artgerm and Albert Bierstadt, masterpiece"

generator = torch.Generator(device="cuda").manual_seed(0)
images = []
for i in range(4):
    image = guided_pipeline(
        prompt,
        num_inference_steps=50,
        guidance_scale=7.5,
        clip_guidance_scale=100,
        num_cutouts=4,
        use_cutouts=False,
        generator=generator,
    ).images[0]
    images.append(image)

# save the images locally
for i, img in enumerate(images):
    img.save(f"./clip_guided_sd/image_{i}.png")
```

The `images` list contains PIL images that can be saved locally or displayed directly in Google Colab. The generated images tend to be of higher quality than with vanilla Stable Diffusion. For example, the script above generates the following images:

![clip_guidance](https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/clip_guidance/merged_clip_guidance.jpg)
### One Step Unet

The dummy "one-step-unet" example can be run as follows:

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("google/ddpm-cifar10-32", custom_pipeline="one_step_unet")
pipe()
```

**Note**: This community pipeline is not useful as a feature; it only serves as an example of how community pipelines can be added (see https://github.com/huggingface/diffusers/issues/841).
### Stable Diffusion Interpolation

The following code can run on a GPU with at least 8 GB of VRAM and takes roughly 5 minutes.

```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    safety_checker=None,  # Very important for videos...lots of false positives while interpolating
    custom_pipeline="interpolate_stable_diffusion",
).to("cuda")
pipe.enable_attention_slicing()

frame_filepaths = pipe.walk(
    prompts=["a dog", "a cat", "a horse"],
    seeds=[42, 1337, 1234],
    num_interpolation_steps=16,
    output_dir="./dreams",
    batch_size=4,
    height=512,
    width=512,
    guidance_scale=8.5,
    num_inference_steps=50,
)
```

The `walk(...)` function returns a list of images saved to the folder defined in `output_dir`. You can use these images to create videos of Stable Diffusion.

> Please have a look at https://github.com/nateraw/stable-diffusion-videos for more in-depth information on how to create videos with Stable Diffusion as well as more features.
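To turn the saved frames into a simple animation without extra tooling, Pillow can write an animated GIF directly. A sketch, assuming the frames were written as PNGs to the `output_dir` above and sort correctly by filename; `frames_to_gif`, the `./dreams` path, and the frame duration are illustrative:

```python
from pathlib import Path

from PIL import Image


def frames_to_gif(frame_dir, out_path, duration_ms=100):
    """Stitch PNG frames (sorted by filename) into an animated GIF."""
    frames = [Image.open(p) for p in sorted(Path(frame_dir).glob("*.png"))]
    first, *rest = frames
    first.save(out_path, save_all=True, append_images=rest, duration=duration_ms, loop=0)


# e.g. frames_to_gif("./dreams", "./dreams/interpolation.gif")
```

For an MP4 instead of a GIF, dedicated tools such as the stable-diffusion-videos project linked above are a better fit.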
### Stable Diffusion Mega

The Stable Diffusion Mega pipeline lets you use the major use cases of the Stable Diffusion pipeline in a single class.

```python
#!/usr/bin/env python3
from diffusers import DiffusionPipeline
import PIL
import requests
from io import BytesIO
import torch


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="stable_diffusion_mega",
    torch_dtype=torch.float16,
)
pipe.to("cuda")
pipe.enable_attention_slicing()

### Text-to-Image
images = pipe.text2img("An astronaut riding a horse").images

### Image-to-Image
init_image = download_image(
    "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
)
prompt = "A fantasy landscape, trending on artstation"
images = pipe.img2img(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images

### Inpainting
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
prompt = "a cat sitting on a bench"
images = pipe.inpaint(prompt=prompt, image=init_image, mask_image=mask_image, strength=0.75).images
```

As shown above, you can run text-to-image, image-to-image, and inpainting all from one pipeline.
### Long Prompt Weighting Stable Diffusion

This pipeline lets you input prompts without the 77-token length limit. You can also raise the weight of a word with `()` or lower it with `[]`. The pipeline also lets you use the major use cases of the Stable Diffusion pipeline in a single class.

#### pytorch

```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "hakurei/waifu-diffusion", custom_pipeline="lpw_stable_diffusion", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "best_quality (1girl:1.3) bow bride brown_hair closed_mouth frilled_bow frilled_hair_tubes frills (full_body:1.3) fox_ear hair_bow hair_tubes happy hood japanese_clothes kimono long_sleeves red_bow smile solo tabi uchikake white_kimono wide_sleeves cherry_blossoms"
neg_prompt = "lowres, bad_anatomy, error_body, error_hair, error_arm, error_hands, bad_hands, error_fingers, bad_fingers, missing_fingers, error_legs, bad_legs, multiple_legs, missing_legs, error_lighting, error_shadow, error_reflection, text, error, extra_digit, fewer_digits, cropped, worst_quality, low_quality, normal_quality, jpeg_artifacts, signature, watermark, username, blurry"

pipe.text2img(prompt, negative_prompt=neg_prompt, width=512, height=512, max_embeddings_multiples=3).images[0]
```
| pipe.text2img(prompt, negative_prompt=neg_prompt, width=<span class="hljs-number">512</span>, height=<span class="hljs-number">512</span>, max_embeddings_multiples=<span class="hljs-number">3</span>).images[<span class="hljs-number">0</span>]<!-- HTML_TAG_END --></pre></div> <h4 class="relative group"><a id="onnxruntime" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#onnxruntime"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>onnxruntime</span></h4> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" 
transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> DiffusionPipeline | |
| <span class="hljs-keyword">import</span> torch | |
| pipe = DiffusionPipeline.from_pretrained( | |
| <span class="hljs-string">"CompVis/stable-diffusion-v1-4"</span>, | |
| custom_pipeline=<span class="hljs-string">"lpw_stable_diffusion_onnx"</span>, | |
| revision=<span class="hljs-string">"onnx"</span>, | |
| provider=<span class="hljs-string">"CUDAExecutionProvider"</span>, | |
| ) | |
| prompt = <span class="hljs-string">"a photo of an astronaut riding a horse on mars, best quality"</span> | |
| neg_prompt = <span class="hljs-string">"lowres, bad anatomy, error body, error hair, error arm, error hands, bad hands, error fingers, bad fingers, missing fingers, error legs, bad legs, multiple legs, missing legs, error lighting, error shadow, error reflection, text, error, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"</span> | |
| pipe.text2img(prompt, negative_prompt=neg_prompt, width=<span class="hljs-number">512</span>, height=<span class="hljs-number">512</span>, max_embeddings_multiples=<span class="hljs-number">3</span>).images[<span class="hljs-number">0</span>]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1u20860">If you see the warning “Token indices sequence length is longer than the specified maximum sequence length for this model (*** > 77). Running this sequence through the model will result in indexing errors”, this is normal behavior, so don’t worry.</p> <h3 class="relative group"><a id="speech-to-image" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#speech-to-image"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Speech to Image</span></h3> <p data-svelte-h="svelte-mnvjvu">The following code generates an image from an audio sample using pretrained OpenAI whisper-small and Stable Diffusion.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" 
width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch | |
| <span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt | |
| <span class="hljs-keyword">from</span> datasets <span class="hljs-keyword">import</span> load_dataset | |
| <span class="hljs-keyword">from</span> diffusers <span class="hljs-keyword">import</span> DiffusionPipeline | |
| <span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> ( | |
| WhisperForConditionalGeneration, | |
| WhisperProcessor, | |
| ) | |
| device = <span class="hljs-string">"cuda"</span> <span class="hljs-keyword">if</span> torch.cuda.is_available() <span class="hljs-keyword">else</span> <span class="hljs-string">"cpu"</span> | |
| ds = load_dataset(<span class="hljs-string">"hf-internal-testing/librispeech_asr_dummy"</span>, <span class="hljs-string">"clean"</span>, split=<span class="hljs-string">"validation"</span>) | |
| audio_sample = ds[<span class="hljs-number">3</span>] | |
| text = audio_sample[<span class="hljs-string">"text"</span>].lower() | |
| speech_data = audio_sample[<span class="hljs-string">"audio"</span>][<span class="hljs-string">"array"</span>] | |
| model = WhisperForConditionalGeneration.from_pretrained(<span class="hljs-string">"openai/whisper-small"</span>).to(device) | |
| processor = WhisperProcessor.from_pretrained(<span class="hljs-string">"openai/whisper-small"</span>) | |
| diffuser_pipeline = DiffusionPipeline.from_pretrained( | |
| <span class="hljs-string">"CompVis/stable-diffusion-v1-4"</span>, | |
| custom_pipeline=<span class="hljs-string">"speech_to_image_diffusion"</span>, | |
| speech_model=model, | |
| speech_processor=processor, | |
| torch_dtype=torch.float16, | |
| ) | |
| diffuser_pipeline.enable_attention_slicing() | |
| diffuser_pipeline = diffuser_pipeline.to(device) | |
| output = diffuser_pipeline(speech_data) | |
| plt.imshow(output.images[<span class="hljs-number">0</span>])<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-yk4vkv">The example above produces the following output image.</p> <p data-svelte-h="svelte-pca3uz"><img src="https://user-images.githubusercontent.com/45072645/196901736-77d9c6fc-63ee-4072-90b0-dc8b903d63e3.png" alt="image"></p> <p></p> | |
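The Long Prompt Weighting prompts shown earlier use the `(token:weight)` emphasis syntax, e.g. `(1girl:1.3)`. As a rough, self-contained illustration of how such weights can be pulled out of a prompt string (a simplified sketch for intuition only, not the actual parser used by `lpw_stable_diffusion`):

```python
import re

def parse_weighted_prompt(prompt):
    """Split a prompt into (text, weight) pairs using the (token:weight)
    emphasis convention. Unmarked text gets the default weight 1.0.
    Simplified sketch: no nesting, no escaped parentheses."""
    pattern = re.compile(r"\(([^():]+):([0-9.]+)\)")
    pieces = []
    pos = 0
    for m in pattern.finditer(prompt):
        if m.start() > pos:
            # plain text before the weighted span keeps weight 1.0
            pieces.append((prompt[pos:m.start()], 1.0))
        pieces.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        pieces.append((prompt[pos:], 1.0))
    return pieces

print(parse_weighted_prompt("best_quality (1girl:1.3) bow"))
# [('best_quality ', 1.0), ('1girl', 1.3), (' bow', 1.0)]
```

In the real pipeline, these per-token weights are applied to the text-encoder embeddings, which is also why prompts longer than the usual 77-token limit can still be handled (the `max_embeddings_multiples` argument controls how many 77-token windows are encoded).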
| <script> | |
| { | |
| __sveltekit_198thv9 = { | |
| assets: "/docs/diffusers/main/ko", | |
| base: "/docs/diffusers/main/ko", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/diffusers/main/ko/_app/immutable/entry/start.0574fe93.js"), | |
| import("/docs/diffusers/main/ko/_app/immutable/entry/app.f5608ddb.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 40], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |