| # Guiders | |
| Guiders are components in Modular Diffusers that control how the diffusion process is guided during generation. They implement various guidance techniques to improve generation quality and control. | |
| ## BaseGuidance[[diffusers.guiders.guider_utils.BaseGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.guiders.guider_utils.BaseGuidance</name><anchor>diffusers.guiders.guider_utils.BaseGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/guider_utils.py#L36</source><parameters>[{"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters></docstring> | |
| Base class providing the skeleton for implementing guidance techniques. | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>cleanup_models</name><anchor>diffusers.guiders.guider_utils.BaseGuidance.cleanup_models</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/guider_utils.py#L119</source><parameters>[{"name": "denoiser", "val": ": Module"}]</parameters></docstring> | |
| Cleans up the models for the guidance technique after a given batch of data. This method should be overridden | |
| in subclasses to implement specific model cleanup logic. It is useful for removing any hooks or other stateful | |
| modifications made during `prepare_models`. | |
| </div> | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>from_pretrained</name><anchor>diffusers.guiders.guider_utils.BaseGuidance.from_pretrained</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/guider_utils.py#L204</source><parameters>[{"name": "pretrained_model_name_or_path", "val": ": typing.Union[str, os.PathLike, NoneType] = None"}, {"name": "subfolder", "val": ": typing.Optional[str] = None"}, {"name": "return_unused_kwargs", "val": " = False"}, {"name": "**kwargs", "val": ""}]</parameters><paramsdesc>- **pretrained_model_name_or_path** (`str` or `os.PathLike`, *optional*) -- | |
| Can be either: | |
| - A string, the *model id* (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on | |
| the Hub. | |
| - A path to a *directory* (for example `./my_model_directory`) containing the guider configuration | |
| saved with `~BaseGuidance.save_pretrained`. | |
| - **subfolder** (`str`, *optional*) -- | |
| The subfolder location of a model file within a larger model repository on the Hub or locally. | |
| - **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) -- | |
| Whether kwargs that are not consumed by the Python class should be returned or not. | |
| - **cache_dir** (`Union[str, os.PathLike]`, *optional*) -- | |
| Path to a directory where a downloaded pretrained model configuration is cached if the standard cache | |
| is not used. | |
| - **force_download** (`bool`, *optional*, defaults to `False`) -- | |
| Whether or not to force the (re-)download of the model weights and configuration files, overriding the | |
| cached versions if they exist. | |
| - **proxies** (`Dict[str, str]`, *optional*) -- | |
| A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128', | |
| 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request. | |
| - **output_loading_info** (`bool`, *optional*, defaults to `False`) -- | |
| Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. | |
| - **local_files_only** (`bool`, *optional*, defaults to `False`) -- | |
| Whether to only load local model weights and configuration files or not. If set to `True`, the model | |
| won't be downloaded from the Hub. | |
| - **token** (`str` or `bool`, *optional*) -- | |
| The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from | |
| `diffusers-cli login` (stored in `~/.huggingface`) is used. | |
| - **revision** (`str`, *optional*, defaults to `"main"`) -- | |
| The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier | |
| allowed by Git.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Instantiate a guider from a pre-defined JSON configuration file in a local directory or Hub repository. | |
| > [!TIP] | |
| > To use private or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models), log in | |
| > with `hf auth login`. You can also activate the special | |
| > ["offline-mode"](https://huggingface.co/diffusers/installation.html#offline-mode) to use this method in a | |
| > firewalled environment. | |
| </div> | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>prepare_models</name><anchor>diffusers.guiders.guider_utils.BaseGuidance.prepare_models</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/guider_utils.py#L112</source><parameters>[{"name": "denoiser", "val": ": Module"}]</parameters></docstring> | |
| Prepares the models for the guidance technique on a given batch of data. This method should be overridden in | |
| subclasses to implement specific model preparation logic. | |
| </div> | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>save_pretrained</name><anchor>diffusers.guiders.guider_utils.BaseGuidance.save_pretrained</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/guider_utils.py#L265</source><parameters>[{"name": "save_directory", "val": ": typing.Union[str, os.PathLike]"}, {"name": "push_to_hub", "val": ": bool = False"}, {"name": "**kwargs", "val": ""}]</parameters><paramsdesc>- **save_directory** (`str` or `os.PathLike`) -- | |
| Directory where the configuration JSON file will be saved (will be created if it does not exist). | |
| - **push_to_hub** (`bool`, *optional*, defaults to `False`) -- | |
| Whether or not to push your model to the Hugging Face Hub after saving it. You can specify the | |
| repository you want to push to with `repo_id` (will default to the name of `save_directory` in your | |
| namespace). | |
| - **kwargs** (`Dict[str, Any]`, *optional*) -- | |
| Additional keyword arguments passed along to the [push_to_hub()](/docs/diffusers/pr_12229/en/api/schedulers/overview#diffusers.utils.PushToHubMixin.push_to_hub) method.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Save a guider configuration object to a directory so that it can be reloaded using the | |
| `~BaseGuidance.from_pretrained` class method. | |
| </div> | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>set_input_fields</name><anchor>diffusers.guiders.guider_utils.BaseGuidance.set_input_fields</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/guider_utils.py#L75</source><parameters>[{"name": "**kwargs", "val": ": typing.Dict[str, typing.Union[str, typing.Tuple[str, str]]]"}]</parameters><paramsdesc>- ****kwargs** (`Dict[str, Union[str, Tuple[str, str]]]`) -- | |
| A dictionary where the keys are the names of the fields that will be used to store the data once it is | |
| prepared with `prepare_inputs`. The values can be either a string or a tuple of length 2, which is used | |
| to look up the required data provided for preparation. | |
| If a string is provided, it will be used as the conditional data (or unconditional if used with a | |
| guidance method that requires it). If a tuple of length 2 is provided, the first element must be the | |
| conditional data identifier and the second element must be the unconditional data identifier or None. | |
| Example: | |
| ``` | |
| data = {"prompt_embeds": <some tensor>, "negative_prompt_embeds": <some tensor>, "latents": <some tensor>} | |
| BaseGuidance.set_input_fields( | |
| latents="latents", | |
| prompt_embeds=("prompt_embeds", "negative_prompt_embeds"), | |
| ) | |
| ```</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Set the input fields for the guidance technique. The input fields are used to specify the names of the returned | |
| attributes containing the prepared data after `prepare_inputs` is called. The prepared data is obtained from | |
| the values of the provided keyword arguments to this method. | |
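The string-versus-tuple lookup described above can be illustrated with a small sketch. `resolve_input_fields` is a hypothetical helper written for this example, not a function of the library, and the string placeholders stand in for tensors.

```python
from typing import Any, Dict, Optional, Tuple, Union

# Hypothetical type alias: a single identifier, or a (conditional, unconditional) pair.
FieldSpec = Union[str, Tuple[str, Optional[str]]]


def resolve_input_fields(data: Dict[str, Any], fields: Dict[str, FieldSpec]):
    """Split `data` into conditional and unconditional views (illustrative only)."""
    cond: Dict[str, Any] = {}
    uncond: Dict[str, Any] = {}
    for name, spec in fields.items():
        if isinstance(spec, str):
            # A single identifier is shared by both branches.
            cond[name] = data[spec]
            uncond[name] = data[spec]
        else:
            cond_key, uncond_key = spec
            cond[name] = data[cond_key]
            # The unconditional identifier may be None.
            uncond[name] = data[uncond_key] if uncond_key is not None else None
    return cond, uncond


data = {"prompt_embeds": "P", "negative_prompt_embeds": "N", "latents": "L"}
cond, uncond = resolve_input_fields(
    data,
    {"latents": "latents", "prompt_embeds": ("prompt_embeds", "negative_prompt_embeds")},
)
```

Here `latents` is shared by both branches, while `prompt_embeds` resolves to a different entry for each branch.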
| </div></div> | |
| ## ClassifierFreeGuidance[[diffusers.ClassifierFreeGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.ClassifierFreeGuidance</name><anchor>diffusers.ClassifierFreeGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/classifier_free_guidance.py#L28</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
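As a rough illustration of how `start` and `stop` could translate into step indices, the sketch below converts the two fractions into a step window. The function name, the truncation with `int`, and the half-open interval are assumptions made for this example and may differ from the library's behavior.

```python
def guidance_is_active(step: int, num_inference_steps: int,
                       start: float = 0.0, stop: float = 1.0) -> bool:
    """Return True if guidance applies at `step` (0-indexed). Illustrative only."""
    # Assumption: fractions are truncated to whole step indices.
    first_step = int(start * num_inference_steps)
    last_step = int(stop * num_inference_steps)
    # Assumption: the window is half-open, [first_step, last_step).
    return first_step <= step < last_step


# With 20 steps, start=0.25 and stop=0.5, guidance covers steps 5 through 9.
active_steps = [s for s in range(20) if guidance_is_active(s, 20, start=0.25, stop=0.5)]
```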
| Classifier-free guidance (CFG): https://huggingface.co/papers/2207.12598 | |
| CFG is a technique used to improve generation quality and condition-following in diffusion models. It works by | |
| jointly training a model on both conditional and unconditional data, and using a weighted sum of the two during | |
| inference. This allows the model to trade off between generation quality and sample diversity. The original paper | |
| proposes scaling and shifting the conditional distribution based on the difference between conditional and | |
| unconditional predictions. [x_pred = x_cond + scale * (x_cond - x_uncond)] | |
| Diffusers implemented the scaling and shifting on the unconditional prediction instead based on the [Imagen | |
| paper](https://huggingface.co/papers/2205.11487), which is equivalent to what the original paper proposed in | |
| theory. [x_pred = x_uncond + scale * (x_cond - x_uncond)] | |
| The intuition behind the original formulation can be thought of as moving the conditional distribution estimates | |
| further away from the unconditional distribution estimates, while the diffusers-native implementation can be | |
| thought of as moving the unconditional distribution towards the conditional distribution estimates to get rid of | |
| the unconditional predictions (usually negative features like "bad quality, bad anatomy, watermarks", etc.). | |
| The `use_original_formulation` argument can be set to `True` to use the original CFG formulation mentioned in the | |
| paper. By default, we use the diffusers-native implementation that has been in the codebase for a long time. | |
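The two formulas above can be compared side by side with a minimal sketch. Plain Python lists stand in for prediction tensors, and the function names are hypothetical. It follows from the formulas that the original formulation with scale `w` gives the same result as the diffusers-native one with scale `w + 1`.

```python
from typing import List


def cfg_original(x_cond: List[float], x_uncond: List[float], scale: float) -> List[float]:
    # x_pred = x_cond + scale * (x_cond - x_uncond)
    return [c + scale * (c - u) for c, u in zip(x_cond, x_uncond)]


def cfg_diffusers(x_cond: List[float], x_uncond: List[float], scale: float) -> List[float]:
    # x_pred = x_uncond + scale * (x_cond - x_uncond)
    return [u + scale * (c - u) for c, u in zip(x_cond, x_uncond)]


x_cond = [1.0, 2.0]
x_uncond = [0.5, 1.0]

original = cfg_original(x_cond, x_uncond, scale=2.0)
native = cfg_diffusers(x_cond, x_uncond, scale=3.0)  # scale shifted by one
```

A consequence is that a `guidance_scale` of `1.0` in the diffusers-native formula returns the conditional prediction unchanged, whereas the original formula does so at a scale of `0.0`.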
| </div> | |
| ## ClassifierFreeZeroStarGuidance[[diffusers.ClassifierFreeZeroStarGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.ClassifierFreeZeroStarGuidance</name><anchor>diffusers.ClassifierFreeZeroStarGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/classifier_free_zero_star_guidance.py#L28</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "zero_init_steps", "val": ": int = 1"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **zero_init_steps** (`int`, defaults to `1`) -- | |
| The number of inference steps for which the noise predictions are zeroed out (see Section 4.2). | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Classifier-free Zero* (CFG-Zero*): https://huggingface.co/papers/2503.18886 | |
| This is an implementation of the Classifier-Free Zero* guidance technique, which is a variant of classifier-free | |
| guidance. It proposes zero initialization of the noise predictions for the first few steps of the diffusion | |
| process, and also introduces an optimal rescaling factor for the noise predictions, which can help in improving the | |
| quality of generated images. | |
| The authors of the paper suggest setting zero initialization in the first 4% of the inference steps. | |
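The two ideas in this description, zero initialization and a rescaling factor, can be sketched as follows. This is one reading of the paper applied to flat Python lists, not the library's code: the function name, the projection-based scale `alpha`, and the small `eps` term are assumptions made for the example.

```python
from typing import List


def cfg_zero_star(pred_cond: List[float], pred_uncond: List[float],
                  guidance_scale: float, step: int,
                  zero_init_steps: int = 1, eps: float = 1e-8) -> List[float]:
    """Illustrative CFG-Zero* combination for a single sample."""
    # Zero initialization: the prediction is zeroed for the first few steps.
    if step < zero_init_steps:
        return [0.0] * len(pred_cond)
    # Assumed rescaling factor: projection of the conditional prediction
    # onto the unconditional one.
    dot = sum(c * u for c, u in zip(pred_cond, pred_uncond))
    squared_norm = sum(u * u for u in pred_uncond)
    alpha = dot / (squared_norm + eps)
    # CFG applied to the rescaled unconditional prediction.
    return [alpha * u + guidance_scale * (c - alpha * u)
            for c, u in zip(pred_cond, pred_uncond)]


zeroed = cfg_zero_star([2.0, 0.0], [1.0, 0.0], guidance_scale=3.0, step=0)
guided = cfg_zero_star([2.0, 0.0], [1.0, 0.0], guidance_scale=3.0, step=1)
```

In this toy case the two predictions are parallel, so the rescaled unconditional prediction matches the conditional one and the guided result stays close to `pred_cond`. For a 50-step schedule, the suggested 4% corresponds to 2 zero-initialized steps.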
| </div> | |
| ## SkipLayerGuidance[[diffusers.SkipLayerGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.SkipLayerGuidance</name><anchor>diffusers.SkipLayerGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/skip_layer_guidance.py#L30</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "skip_layer_guidance_scale", "val": ": float = 2.8"}, {"name": "skip_layer_guidance_start", "val": ": float = 0.01"}, {"name": "skip_layer_guidance_stop", "val": ": float = 0.2"}, {"name": "skip_layer_guidance_layers", "val": ": typing.Union[int, typing.List[int], NoneType] = None"}, {"name": "skip_layer_config", "val": ": typing.Union[diffusers.hooks.layer_skip.LayerSkipConfig, typing.List[diffusers.hooks.layer_skip.LayerSkipConfig], typing.Dict[str, typing.Any]] = None"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **skip_layer_guidance_scale** (`float`, defaults to `2.8`) -- | |
| The scale parameter for skip layer guidance. Anatomy and structure coherence may improve with higher | |
| values, but it may also lead to overexposure and saturation. | |
| - **skip_layer_guidance_start** (`float`, defaults to `0.01`) -- | |
| The fraction of the total number of denoising steps after which skip layer guidance starts. | |
| - **skip_layer_guidance_stop** (`float`, defaults to `0.2`) -- | |
| The fraction of the total number of denoising steps after which skip layer guidance stops. | |
| - **skip_layer_guidance_layers** (`int` or `List[int]`, *optional*) -- | |
| The layer indices to apply skip layer guidance to. Can be a single integer or a list of integers. If not | |
| provided, `skip_layer_config` must be provided. The recommended values are `[7, 8, 9]` for Stable Diffusion | |
| 3.5 Medium. | |
| - **skip_layer_config** (`LayerSkipConfig` or `List[LayerSkipConfig]`, *optional*) -- | |
| The configuration for the skip layer guidance. Can be a single `LayerSkipConfig` or a list of | |
| `LayerSkipConfig`. If not provided, `skip_layer_guidance_layers` must be provided. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Skip Layer Guidance (SLG): https://github.com/Stability-AI/sd3.5 | |
| Spatio-Temporal Guidance (STG): https://huggingface.co/papers/2411.18664 | |
| SLG was introduced by StabilityAI for improving structure and anatomy coherence in generated images. It works by | |
| skipping the forward pass of specified transformer blocks during the denoising process on an additional conditional | |
| batch of data, apart from the conditional and unconditional batches already used in CFG | |
| ([~guiders.classifier_free_guidance.ClassifierFreeGuidance]), and then scaling and shifting the CFG predictions | |
| based on the difference between conditional without skipping and conditional with skipping predictions. | |
| The intuition behind SLG can be thought of as moving the CFG predicted distribution estimates further away from | |
| worse versions of the conditional distribution estimates (because skipping layers is equivalent to using a worse | |
| version of the model for the conditional prediction). | |
| STG is an improvement and follow-up work combining ideas from SLG, PAG and similar techniques for improving | |
| generation quality in video diffusion models. | |
| Additional reading: | |
| - [Guiding a Diffusion Model with a Bad Version of Itself](https://huggingface.co/papers/2406.02507) | |
| The values for `skip_layer_guidance_scale`, `skip_layer_guidance_start`, and `skip_layer_guidance_stop` | |
| default to the values recommended by StabilityAI for Stable Diffusion 3.5 Medium. | |
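The combination of the three predictions described above can be sketched as follows, using the diffusers-native CFG formula as the base. Flat lists stand in for tensors, and the function name and exact arrangement of terms are assumptions for illustration.

```python
from typing import List


def skip_layer_guidance(pred_cond: List[float], pred_uncond: List[float],
                        pred_cond_skip: List[float],
                        guidance_scale: float = 7.5,
                        skip_layer_guidance_scale: float = 2.8) -> List[float]:
    """Illustrative SLG combination of three predictions."""
    return [
        # CFG term: shift from the unconditional towards the conditional prediction.
        u + guidance_scale * (c - u)
        # SLG term: push away from the prediction made with skipped layers.
        + skip_layer_guidance_scale * (c - s)
        for c, u, s in zip(pred_cond, pred_uncond, pred_cond_skip)
    ]


pred = skip_layer_guidance([1.0], [0.0], [0.5],
                           guidance_scale=2.0, skip_layer_guidance_scale=4.0)
```

When the skipped-layer prediction equals the conditional prediction, the second term vanishes and the result reduces to plain CFG.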
| </div> | |
| ## SmoothedEnergyGuidance[[diffusers.SmoothedEnergyGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.SmoothedEnergyGuidance</name><anchor>diffusers.SmoothedEnergyGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/smoothed_energy_guidance.py#L30</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "seg_guidance_scale", "val": ": float = 2.8"}, {"name": "seg_blur_sigma", "val": ": float = 9999999.0"}, {"name": "seg_blur_threshold_inf", "val": ": float = 9999.0"}, {"name": "seg_guidance_start", "val": ": float = 0.0"}, {"name": "seg_guidance_stop", "val": ": float = 1.0"}, {"name": "seg_guidance_layers", "val": ": typing.Union[int, typing.List[int], NoneType] = None"}, {"name": "seg_guidance_config", "val": ": typing.Union[diffusers.hooks.smoothed_energy_guidance_utils.SmoothedEnergyGuidanceConfig, typing.List[diffusers.hooks.smoothed_energy_guidance_utils.SmoothedEnergyGuidanceConfig]] = None"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **seg_guidance_scale** (`float`, defaults to `2.8`) -- | |
| The scale parameter for smoothed energy guidance. Anatomy and structure coherence may improve with higher | |
| values, but it may also lead to overexposure and saturation. | |
| - **seg_blur_sigma** (`float`, defaults to `9999999.0`) -- | |
| The amount by which we blur the attention weights. Setting this value greater than 9999.0 results in | |
| infinite blur, which means uniform queries. Controlling it exponentially is empirically effective. | |
| - **seg_blur_threshold_inf** (`float`, defaults to `9999.0`) -- | |
| The threshold above which the blur is considered infinite. | |
| - **seg_guidance_start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which smoothed energy guidance starts. | |
| - **seg_guidance_stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which smoothed energy guidance stops. | |
| - **seg_guidance_layers** (`int` or `List[int]`, *optional*) -- | |
| The layer indices to apply smoothed energy guidance to. Can be a single integer or a list of integers. If | |
| not provided, `seg_guidance_config` must be provided. The recommended values are `[7, 8, 9]` for Stable | |
| Diffusion 3.5 Medium. | |
| - **seg_guidance_config** (`SmoothedEnergyGuidanceConfig` or `List[SmoothedEnergyGuidanceConfig]`, *optional*) -- | |
| The configuration for the smoothed energy layer guidance. Can be a single `SmoothedEnergyGuidanceConfig` or | |
| a list of `SmoothedEnergyGuidanceConfig`. If not provided, `seg_guidance_layers` must be provided. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Smoothed Energy Guidance (SEG): https://huggingface.co/papers/2408.00760 | |
| SEG is only supported as an experimental prototype feature for now, so the implementation may be modified in the | |
| future without warning or guarantee of reproducibility. This implementation assumes: | |
| - Generated images are square (height == width) | |
| - The model does not combine different modalities together (e.g., text and image latent streams are not combined | |
| as they are in Flux) | |
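To make the role of `seg_blur_sigma` and `seg_blur_threshold_inf` more concrete, the sketch below blurs a 1D sequence of scalar "queries" with a Gaussian kernel and switches to a uniform average once sigma exceeds the threshold. The 1D scalar layout and the function name are simplifications assumed for this example.

```python
import math
from typing import List


def blur_queries(queries: List[float], sigma: float,
                 threshold_inf: float = 9999.0) -> List[float]:
    """Illustrative Gaussian blur over a 1D sequence of scalar queries."""
    n = len(queries)
    if sigma > threshold_inf:
        # Infinite blur: every query becomes the mean (uniform queries).
        mean = sum(queries) / n
        return [mean] * n
    blurred = []
    for i in range(n):
        # Normalized Gaussian weights centered on position i.
        weights = [math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2)) for j in range(n)]
        total = sum(weights)
        blurred.append(sum(w * q for w, q in zip(weights, queries)) / total)
    return blurred


uniform = blur_queries([1.0, 2.0, 3.0], sigma=9999999.0)
nearly_unchanged = blur_queries([1.0, 2.0, 3.0], sigma=0.1)
```

With the default `seg_blur_sigma` of `9999999.0`, the threshold branch is taken and all queries collapse to their mean.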
| </div> | |
| ## PerturbedAttentionGuidance[[diffusers.PerturbedAttentionGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.PerturbedAttentionGuidance</name><anchor>diffusers.PerturbedAttentionGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/perturbed_attention_guidance.py#L34</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "perturbed_guidance_scale", "val": ": float = 2.8"}, {"name": "perturbed_guidance_start", "val": ": float = 0.01"}, {"name": "perturbed_guidance_stop", "val": ": float = 0.2"}, {"name": "perturbed_guidance_layers", "val": ": typing.Union[int, typing.List[int], NoneType] = None"}, {"name": "perturbed_guidance_config", "val": ": typing.Union[diffusers.hooks.layer_skip.LayerSkipConfig, typing.List[diffusers.hooks.layer_skip.LayerSkipConfig], typing.Dict[str, typing.Any]] = None"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **perturbed_guidance_scale** (`float`, defaults to `2.8`) -- | |
| The scale parameter for perturbed attention guidance. | |
| - **perturbed_guidance_start** (`float`, defaults to `0.01`) -- | |
| The fraction of the total number of denoising steps after which perturbed attention guidance starts. | |
| - **perturbed_guidance_stop** (`float`, defaults to `0.2`) -- | |
| The fraction of the total number of denoising steps after which perturbed attention guidance stops. | |
| - **perturbed_guidance_layers** (`int` or `List[int]`, *optional*) -- | |
| The layer indices to apply perturbed attention guidance to. Can be a single integer or a list of integers. | |
| If not provided, `perturbed_guidance_config` must be provided. | |
| - **perturbed_guidance_config** (`LayerSkipConfig` or `List[LayerSkipConfig]`, *optional*) -- | |
| The configuration for the perturbed attention guidance. Can be a single `LayerSkipConfig` or a list of | |
| `LayerSkipConfig`. If not provided, `perturbed_guidance_layers` must be provided. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Perturbed Attention Guidance (PAG): https://huggingface.co/papers/2403.17377 | |
| The intuition behind PAG can be thought of as moving the CFG predicted distribution estimates further away from | |
| worse versions of the conditional distribution estimates. PAG was one of the first techniques to introduce the idea | |
| of using a worse version of the trained model for better guiding itself in the denoising process. It perturbs the | |
| attention scores of the latent stream by replacing the score matrix with an identity matrix for selectively chosen | |
| layers. | |
| Additional reading: | |
| - [Guiding a Diffusion Model with a Bad Version of Itself](https://huggingface.co/papers/2406.02507) | |
| PAG's implementation is similar to that of SkipLayerGuidance because the two share configuration parameters | |
| and implementation details. | |
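The effect of replacing the score matrix with an identity matrix can be seen in a toy, single-head attention written in plain Python. All names here are hypothetical, and the sketch ignores batching, multiple heads and projections.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix, perturb: bool = False) -> Matrix:
    """Toy attention; with `perturb=True` the attention map is the identity."""
    n, d = len(q), len(q[0])
    if perturb:
        # Perturbed branch: each token attends only to itself.
        weights = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    else:
        scores = [[sum(q[i][c] * k[j][c] for c in range(d)) / math.sqrt(d)
                   for j in range(n)] for i in range(n)]
        weights = [softmax(row) for row in scores]
    return [[sum(weights[i][j] * v[j][c] for j in range(n))
             for c in range(len(v[0]))] for i in range(n)]


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]

normal_out = attention(q, k, v)
perturbed_out = attention(q, k, v, perturb=True)
```

With the identity map, the output is simply `v`: no information is mixed between tokens, which gives the degraded prediction that the guidance pushes away from.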
| </div> | |
| ## AdaptiveProjectedGuidance[[diffusers.AdaptiveProjectedGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.AdaptiveProjectedGuidance</name><anchor>diffusers.AdaptiveProjectedGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/adaptive_projected_guidance.py#L28</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "adaptive_projected_guidance_momentum", "val": ": typing.Optional[float] = None"}, {"name": "adaptive_projected_guidance_rescale", "val": ": float = 15.0"}, {"name": "eta", "val": ": float = 1.0"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **adaptive_projected_guidance_momentum** (`float`, defaults to `None`) -- | |
| The momentum parameter for the adaptive projected guidance. Disabled if set to `None`. | |
| - **adaptive_projected_guidance_rescale** (`float`, defaults to `15.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Adaptive Projected Guidance (APG): https://huggingface.co/papers/2410.02416 | |
| </div> | |
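The projection idea behind APG can be illustrated with a minimal sketch. This is not the library's implementation: it works on flat Python lists for a single sample, omits momentum, and assumes that `adaptive_projected_guidance_rescale` acts as a threshold on the norm of the guidance difference (called `norm_threshold` here, a hypothetical name). The helper names `dot` and `apg_update` are likewise illustrative.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def apg_update(pred_cond, pred_uncond, guidance_scale=7.5, eta=1.0, norm_threshold=0.0):
    """Toy APG step on flat lists (one sample, no momentum)."""
    diff = [c - u for c, u in zip(pred_cond, pred_uncond)]

    # Assumption: the rescale value clamps the norm of the guidance difference.
    if norm_threshold > 0:
        diff_norm = math.sqrt(dot(diff, diff))
        if diff_norm > norm_threshold:
            diff = [d * norm_threshold / diff_norm for d in diff]

    # Split the difference into parts parallel and orthogonal to pred_cond.
    cond_sq = dot(pred_cond, pred_cond)
    coeff = dot(diff, pred_cond) / cond_sq if cond_sq > 0 else 0.0
    parallel = [coeff * c for c in pred_cond]
    orthogonal = [d - p for d, p in zip(diff, parallel)]

    # eta down-weights the parallel part; eta = 1.0 keeps the full difference.
    update = [o + eta * p for o, p in zip(orthogonal, parallel)]
    return [c + (guidance_scale - 1.0) * u for c, u in zip(pred_cond, update)]
```

In this sketch, `eta=1.0` with no threshold reduces to ordinary classifier-free guidance, while `eta=0.0` keeps only the component of the update orthogonal to the conditional prediction.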
| ## AutoGuidance[[diffusers.AutoGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.AutoGuidance</name><anchor>diffusers.AutoGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/auto_guidance.py#L30</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "auto_guidance_layers", "val": ": typing.Union[int, typing.List[int], NoneType] = None"}, {"name": "auto_guidance_config", "val": ": typing.Union[diffusers.hooks.layer_skip.LayerSkipConfig, typing.List[diffusers.hooks.layer_skip.LayerSkipConfig], typing.Dict[str, typing.Any]] = None"}, {"name": "dropout", "val": ": typing.Optional[float] = None"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **auto_guidance_layers** (`int` or `List[int]`, *optional*) -- | |
| The layer indices to apply autoguidance to. Can be a single integer or a list of integers. If not | |
| provided, `auto_guidance_config` must be provided. | |
| - **auto_guidance_config** (`LayerSkipConfig` or `List[LayerSkipConfig]`, *optional*) -- | |
| The layer-skip configuration for autoguidance. Can be a single `LayerSkipConfig` or a list of | |
| `LayerSkipConfig`. If not provided, `auto_guidance_layers` must be provided. | |
| - **dropout** (`float`, *optional*) -- | |
| The dropout probability for autoguidance on the enabled skip layers (either with `auto_guidance_layers` or | |
| `auto_guidance_config`). If not provided, the dropout probability will be set to 1.0. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| AutoGuidance: https://huggingface.co/papers/2406.02507 | |
| </div> | |
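As a rough illustration of how the parameters above interact, the sketch below guides a full forward pass with a weakened pass of the same toy network. It is a simplified model under stated assumptions, not the library's code: the "network" is a list of scalar residual blocks, `dropout` is fixed at 1.0 (the selected blocks are skipped entirely), and the names `run_blocks` and `auto_guided` are hypothetical.

```python
def run_blocks(x, blocks, skip=()):
    """Apply residual blocks x <- x + f(x), skipping the indices in `skip`."""
    for i, f in enumerate(blocks):
        if i in skip:
            continue  # dropout of 1.0: this block's contribution is dropped
        x = x + f(x)
    return x


def auto_guided(x, blocks, auto_guidance_layers, guidance_scale=7.5):
    """Extrapolate from the weakened prediction toward the full prediction."""
    # Accept a single int or a list of ints, as the parameter description allows.
    if isinstance(auto_guidance_layers, int):
        layers = {auto_guidance_layers}
    else:
        layers = set(auto_guidance_layers)

    pred_good = run_blocks(x, blocks)
    pred_bad = run_blocks(x, blocks, skip=layers)
    # Assumed combination: same shape as the diffusers-native CFG formula.
    return pred_bad + guidance_scale * (pred_good - pred_bad)


# Three toy residual blocks acting on a scalar.
blocks = [lambda x: 0.5 * x, lambda x: 1.0, lambda x: -0.25 * x]
guided = auto_guided(2.0, blocks, auto_guidance_layers=1, guidance_scale=2.0)
```

With `guidance_scale=1.0` the weakened pass cancels out and the result equals the full prediction. Larger values push the output further away from what the weakened network produces.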
| ## TangentialClassifierFreeGuidance[[diffusers.TangentialClassifierFreeGuidance]] | |
| <div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8"> | |
| <docstring><name>class diffusers.TangentialClassifierFreeGuidance</name><anchor>diffusers.TangentialClassifierFreeGuidance</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/guiders/tangential_classifier_free_guidance.py#L28</source><parameters>[{"name": "guidance_scale", "val": ": float = 7.5"}, {"name": "guidance_rescale", "val": ": float = 0.0"}, {"name": "use_original_formulation", "val": ": bool = False"}, {"name": "start", "val": ": float = 0.0"}, {"name": "stop", "val": ": float = 1.0"}]</parameters><paramsdesc>- **guidance_scale** (`float`, defaults to `7.5`) -- | |
| The scale parameter for classifier-free guidance. Higher values result in stronger conditioning on the text | |
| prompt, while lower values allow for more freedom in generation. Higher values may lead to saturation and | |
| deterioration of image quality. | |
| - **guidance_rescale** (`float`, defaults to `0.0`) -- | |
| The rescale factor applied to the noise predictions. This is used to improve image quality and fix | |
| overexposure. Based on Section 3.4 from [Common Diffusion Noise Schedules and Sample Steps are | |
| Flawed](https://huggingface.co/papers/2305.08891). | |
| - **use_original_formulation** (`bool`, defaults to `False`) -- | |
| Whether to use the original formulation of classifier-free guidance as proposed in the paper. By default, | |
| we use the diffusers-native implementation that has been in the codebase for a long time. See | |
| [~guiders.classifier_free_guidance.ClassifierFreeGuidance] for more details. | |
| - **start** (`float`, defaults to `0.0`) -- | |
| The fraction of the total number of denoising steps after which guidance starts. | |
| - **stop** (`float`, defaults to `1.0`) -- | |
| The fraction of the total number of denoising steps after which guidance stops.</paramsdesc><paramgroups>0</paramgroups></docstring> | |
| Tangential Classifier Free Guidance (TCFG): https://huggingface.co/papers/2503.18137 | |
| </div> | |
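The `start`, `stop`, and `use_original_formulation` parameters shared by the guiders above can be sketched as follows. This is an assumption-laden illustration on scalars: the helper names `guidance_enabled` and `cfg` are hypothetical, the step window is assumed to be half-open, and the tangential projection specific to TCFG is deliberately left out.

```python
def guidance_enabled(step, num_inference_steps, start=0.0, stop=1.0):
    """Return True while `step` falls inside the [start, stop) fraction of the run."""
    start_step = int(start * num_inference_steps)
    stop_step = int(stop * num_inference_steps)
    return start_step <= step < stop_step


def cfg(pred_cond, pred_uncond, guidance_scale=7.5, use_original_formulation=False):
    """Combine predictions using either CFG formulation."""
    shift = pred_cond - pred_uncond
    # The original formulation shifts from the conditional prediction;
    # the diffusers-native one shifts from the unconditional prediction.
    base = pred_cond if use_original_formulation else pred_uncond
    return base + guidance_scale * shift


# Guidance active only for the middle half of an 8-step schedule.
active_steps = [s for s in range(8) if guidance_enabled(s, 8, start=0.25, stop=0.75)]
```

Under these assumptions the two formulations differ only by an offset of one in the effective scale, so `guidance_scale=1.0` in the native form gives the same result as `guidance_scale=0.0` in the original form.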