# Utilities
Utility and helper functions for working with 🤗 Diffusers.
## numpy_to_pil[[diffusers.utils.numpy_to_pil]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.numpy_to_pil</name><anchor>diffusers.utils.numpy_to_pil</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/pil_utils.py#L37</source><parameters>[{"name": "images", "val": ""}]</parameters></docstring>
Convert a NumPy image or a batch of images to a list of PIL images.
</div>
## pt_to_pil[[diffusers.utils.pt_to_pil]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.pt_to_pil</name><anchor>diffusers.utils.pt_to_pil</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/pil_utils.py#L27</source><parameters>[{"name": "images", "val": ""}]</parameters></docstring>
Convert a batch of torch image tensors to a list of PIL images.
</div>
## load_image[[diffusers.utils.load_image]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.load_image</name><anchor>diffusers.utils.load_image</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/loading_utils.py#L14</source><parameters>[{"name": "image", "val": ": typing.Union[str, PIL.Image.Image]"}, {"name": "convert_method", "val": ": typing.Optional[typing.Callable[[PIL.Image.Image], PIL.Image.Image]] = None"}]</parameters><paramsdesc>- **image** (`str` or `PIL.Image.Image`) --
The image to convert to the PIL Image format.
- **convert_method** (Callable[[PIL.Image.Image], PIL.Image.Image], *optional*) --
A conversion method to apply to the image after loading it. When set to `None`, the image will be converted
to "RGB".</paramsdesc><paramgroups>0</paramgroups><rettype>`PIL.Image.Image`</rettype><retdesc>A PIL Image.</retdesc></docstring>
Loads `image` to a PIL Image.
</div>
## load_video[[diffusers.utils.load_video]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.load_video</name><anchor>diffusers.utils.load_video</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/loading_utils.py#L57</source><parameters>[{"name": "video", "val": ": str"}, {"name": "convert_method", "val": ": typing.Optional[typing.Callable[[typing.List[PIL.Image.Image]], typing.List[PIL.Image.Image]]] = None"}]</parameters><paramsdesc>- **video** (`str`) --
A URL or path to a video to convert to a list of PIL images.
- **convert_method** (Callable[[List[PIL.Image.Image]], List[PIL.Image.Image]], *optional*) --
A conversion method to apply to the video after loading it. When set to `None` the images will be converted
to "RGB".</paramsdesc><paramgroups>0</paramgroups><rettype>`List[PIL.Image.Image]`</rettype><retdesc>The video as a list of PIL images.</retdesc></docstring>
Loads `video` into a list of PIL images.
</div>
## export_to_gif[[diffusers.utils.export_to_gif]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.export_to_gif</name><anchor>diffusers.utils.export_to_gif</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/export_utils.py#L28</source><parameters>[{"name": "image", "val": ": typing.List[PIL.Image.Image]"}, {"name": "output_gif_path", "val": ": str = None"}, {"name": "fps", "val": ": int = 10"}]</parameters></docstring>
</div>
## export_to_video[[diffusers.utils.export_to_video]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.export_to_video</name><anchor>diffusers.utils.export_to_video</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/export_utils.py#L141</source><parameters>[{"name": "video_frames", "val": ": typing.Union[typing.List[numpy.ndarray], typing.List[PIL.Image.Image]]"}, {"name": "output_video_path", "val": ": str = None"}, {"name": "fps", "val": ": int = 10"}, {"name": "quality", "val": ": float = 5.0"}, {"name": "bitrate", "val": ": typing.Optional[int] = None"}, {"name": "macro_block_size", "val": ": typing.Optional[int] = 16"}]</parameters></docstring>
- **quality** --
Video output quality. Default is 5. Uses variable bit rate. Highest quality is 10, lowest is 0. Set to `None` to
prevent variable bitrate flags from being passed to FFmpeg so you can manually specify them using output_params
instead. Specifying a fixed bitrate using `bitrate` disables this parameter.
- **bitrate** --
Set a constant bitrate for the video encoding. Default is `None`, which causes the `quality` parameter to be used
instead. Using the `quality` variable bitrate parameter gives better quality videos with smaller file sizes than
specifying a fixed bitrate with this parameter.
- **macro_block_size** --
Size constraint for the video. Width and height must be divisible by this number. If they are not, imageio will
tell FFmpeg to scale the image up to the next closest size divisible by this number. Most codecs are compatible
with a macroblock size of 16 (default); some can go smaller (4, 8). To disable this automatic scaling, set it to
`None` or 1. Be warned, however, that many players can't decode videos that are odd in size, and some codecs will
produce poor results or fail. See https://en.wikipedia.org/wiki/Macroblock.
</div>
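The `macro_block_size` rule above comes down to rounding each dimension up to the next multiple of the block size. The following is a minimal sketch of that arithmetic only; `padded_frame_size` is a hypothetical helper written for illustration and is not part of Diffusers, imageio, or FFmpeg.

```python
from typing import Optional, Tuple


def padded_frame_size(
    width: int, height: int, macro_block_size: Optional[int] = 16
) -> Tuple[int, int]:
    """Round width and height up to the next multiple of macro_block_size.

    Hypothetical helper: it only mirrors the rule described in the text.
    """
    # None or 1 disables the adjustment, as described for macro_block_size.
    if macro_block_size is None or macro_block_size <= 1:
        return width, height

    def round_up(value: int) -> int:
        # Integer ceiling division, then scale back to a multiple of the block size.
        return -(-value // macro_block_size) * macro_block_size

    return round_up(width), round_up(height)


if __name__ == "__main__":
    print(padded_frame_size(500, 300))        # default block size of 16
    print(padded_frame_size(500, 300, None))  # adjustment disabled
```

Checking frame sizes this way before export makes it easier to predict when the written video will end up slightly larger than the input frames.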
## make_image_grid[[diffusers.utils.make_image_grid]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.make_image_grid</name><anchor>diffusers.utils.make_image_grid</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/pil_utils.py#L53</source><parameters>[{"name": "images", "val": ": typing.List[PIL.Image.Image]"}, {"name": "rows", "val": ": int"}, {"name": "cols", "val": ": int"}, {"name": "resize", "val": ": int = None"}]</parameters></docstring>
Prepares a single grid of images. Useful for visualization purposes.
</div>
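To make the `rows` and `cols` arguments concrete, the sketch below computes where each image would land in a row-major grid. It assumes all images share one size and that exactly `rows * cols` images are supplied; `grid_offsets` is a hypothetical helper for illustration, not the library's implementation.

```python
from typing import List, Tuple


def grid_offsets(rows: int, cols: int, width: int, height: int) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) paste position of each cell, in row-major order.

    Hypothetical helper: assumes every image is width x height pixels.
    """
    offsets = []
    for index in range(rows * cols):
        column = index % cols   # position within the current row
        row = index // cols     # which row the image falls into
        offsets.append((column * width, row * height))
    return offsets


def grid_size(rows: int, cols: int, width: int, height: int) -> Tuple[int, int]:
    """Overall canvas size needed to hold the grid."""
    return cols * width, rows * height


if __name__ == "__main__":
    # Six 64x32 images arranged as 2 rows by 3 columns.
    print(grid_size(2, 3, 64, 32))
    print(grid_offsets(2, 3, 64, 32))
```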
## randn_tensor[[diffusers.utils.torch_utils.randn_tensor]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.utils.torch_utils.randn_tensor</name><anchor>diffusers.utils.torch_utils.randn_tensor</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/utils/torch_utils.py#L146</source><parameters>[{"name": "shape", "val": ": typing.Union[typing.Tuple, typing.List]"}, {"name": "generator", "val": ": typing.Union[typing.List[ForwardRef('torch.Generator')], ForwardRef('torch.Generator'), NoneType] = None"}, {"name": "device", "val": ": typing.Union[str, ForwardRef('torch.device'), NoneType] = None"}, {"name": "dtype", "val": ": typing.Optional[ForwardRef('torch.dtype')] = None"}, {"name": "layout", "val": ": typing.Optional[ForwardRef('torch.layout')] = None"}]</parameters></docstring>
A helper function to create random tensors on the desired `device` with the desired `dtype`. When
passing a list of generators, you can seed each element of the batch individually. If CPU generators are passed,
the tensor is always created on the CPU.
</div>
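The point of passing one generator per batch element is that each sample stays reproducible regardless of what else is in the batch. The sketch below illustrates that idea with the standard library's `random.Random` standing in for `torch.Generator`; it is an analogy under that assumption, and `sample_batch` is a hypothetical helper, not how `randn_tensor` is implemented.

```python
import random
from typing import List, Sequence


def sample_batch(seeds: Sequence[int], sample_len: int) -> List[List[float]]:
    """Draw one Gaussian sample per seed, each from its own generator.

    Hypothetical helper: random.Random plays the role of a per-sample generator.
    """
    generators = [random.Random(seed) for seed in seeds]
    return [
        [generator.gauss(0.0, 1.0) for _ in range(sample_len)]
        for generator in generators
    ]


if __name__ == "__main__":
    first = sample_batch([0, 1], sample_len=4)
    second = sample_batch([7, 1], sample_len=4)
    # The sample seeded with 1 is identical in both batches,
    # even though the other batch element changed.
    print(first[1] == second[1])
```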
## apply_layerwise_casting[[diffusers.hooks.apply_layerwise_casting]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.hooks.apply_layerwise_casting</name><anchor>diffusers.hooks.apply_layerwise_casting</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/layerwise_casting.py#L101</source><parameters>[{"name": "module", "val": ": Module"}, {"name": "storage_dtype", "val": ": dtype"}, {"name": "compute_dtype", "val": ": dtype"}, {"name": "skip_modules_pattern", "val": ": typing.Union[str, typing.Tuple[str, ...]] = 'auto'"}, {"name": "skip_modules_classes", "val": ": typing.Optional[typing.Tuple[typing.Type[torch.nn.modules.module.Module], ...]] = None"}, {"name": "non_blocking", "val": ": bool = False"}]</parameters><paramsdesc>- **module** (`torch.nn.Module`) --
The module whose leaf modules will be cast to a high precision dtype for computation, and to a low
precision dtype for storage.
- **storage_dtype** (`torch.dtype`) --
The dtype to cast the module to before/after the forward pass for storage.
- **compute_dtype** (`torch.dtype`) --
The dtype to cast the module to during the forward pass for computation.
- **skip_modules_pattern** (`Tuple[str, ...]`, defaults to `"auto"`) --
A list of patterns to match the names of the modules to skip during the layerwise casting process. If set
to `"auto"`, the default patterns are used. If set to `None`, no modules are skipped. If set to `None`
alongside `skip_modules_classes` being `None`, the layerwise casting is applied directly to the module
instead of its internal submodules.
- **skip_modules_classes** (`Tuple[Type[torch.nn.Module], ...]`, defaults to `None`) --
A list of module classes to skip during the layerwise casting process.
- **non_blocking** (`bool`, defaults to `False`) --
If `True`, the weight casting operations are non-blocking.</paramsdesc><paramgroups>0</paramgroups></docstring>
Applies layerwise casting to a given module. The module expected here is a Diffusers `ModelMixin`, but it can be
any `nn.Module` built from Diffusers layers or PyTorch primitives.
<ExampleCodeBlock anchor="diffusers.hooks.apply_layerwise_casting.example">
Example:
```python
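>>> import torch
>>> from diffusers import CogVideoXTransformer3DModel
>>> from diffusers.hooks import apply_layerwise_casting
>>> # Example checkpoint; substitute the model you are working with.
>>> model_id = "THUDM/CogVideoX-5b"
>>> transformer = CogVideoXTransformer3DModel.from_pretrained(
...     model_id, subfolder="transformer", torch_dtype=torch.bfloat16
... )
>>> # Store weights in float8, upcast to bfloat16 for each forward pass.
>>> apply_layerwise_casting(
...     transformer,
...     storage_dtype=torch.float8_e4m3fn,
...     compute_dtype=torch.bfloat16,
...     skip_modules_pattern=["patch_embed", "norm", "proj_out"],
...     non_blocking=True,
... )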
```
</ExampleCodeBlock>
</div>
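To see which layers a given `skip_modules_pattern` would leave untouched, it can help to test the patterns against module names before applying the hook. The sketch below assumes each pattern is matched as a regular expression anywhere in the module name; that matching rule, the helper `split_by_skip_pattern`, and the module names in the demo are assumptions for illustration, not confirmed library behavior.

```python
import re
from typing import Iterable, List, Optional, Sequence, Tuple


def split_by_skip_pattern(
    module_names: Iterable[str],
    skip_modules_pattern: Optional[Sequence[str]],
) -> Tuple[List[str], List[str]]:
    """Split module names into (cast, skipped) lists.

    Hypothetical helper: assumes a pattern matches if re.search finds it
    anywhere in the name, and that None means nothing is skipped.
    """
    cast, skipped = [], []
    for name in module_names:
        if skip_modules_pattern is not None and any(
            re.search(pattern, name) for pattern in skip_modules_pattern
        ):
            skipped.append(name)
        else:
            cast.append(name)
    return cast, skipped


if __name__ == "__main__":
    # Hypothetical module names, loosely shaped like a transformer's submodules.
    names = [
        "patch_embed.proj",
        "transformer_blocks.0.norm1",
        "transformer_blocks.0.attn1.to_q",
        "proj_out",
    ]
    cast, skipped = split_by_skip_pattern(names, ["patch_embed", "norm", "proj_out"])
    print("cast:", cast)
    print("skipped:", skipped)
```

With a real model, the names could be taken from `module.named_modules()` to preview the effect of a pattern list.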
## apply_group_offloading[[diffusers.hooks.apply_group_offloading]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.hooks.apply_group_offloading</name><anchor>diffusers.hooks.apply_group_offloading</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/group_offloading.py#L445</source><parameters>[{"name": "module", "val": ": Module"}, {"name": "onload_device", "val": ": typing.Union[str, torch.device]"}, {"name": "offload_device", "val": ": typing.Union[str, torch.device] = device(type='cpu')"}, {"name": "offload_type", "val": ": typing.Union[str, diffusers.hooks.group_offloading.GroupOffloadingType] = 'block_level'"}, {"name": "num_blocks_per_group", "val": ": typing.Optional[int] = None"}, {"name": "non_blocking", "val": ": bool = False"}, {"name": "use_stream", "val": ": bool = False"}, {"name": "record_stream", "val": ": bool = False"}, {"name": "low_cpu_mem_usage", "val": ": bool = False"}, {"name": "offload_to_disk_path", "val": ": typing.Optional[str] = None"}]</parameters><paramsdesc>- **module** (`torch.nn.Module`) --
The module to which group offloading is applied.
- **onload_device** (`torch.device`) --
The device to which the group of modules are onloaded.
- **offload_device** (`torch.device`, defaults to `torch.device("cpu")`) --
The device to which the group of modules are offloaded. This should typically be the CPU. Default is CPU.
- **offload_type** (`str` or `GroupOffloadingType`, defaults to "block_level") --
The type of offloading to be applied. Can be one of "block_level" or "leaf_level". Default is
"block_level".
- **offload_to_disk_path** (`str`, *optional*, defaults to `None`) --
The path to the directory where parameters will be offloaded. Setting this option can be useful in limited
RAM environment settings where a reasonable speed-memory trade-off is desired.
- **num_blocks_per_group** (`int`, *optional*) --
The number of blocks per group when using offload_type="block_level". This is required when using
offload_type="block_level".
- **non_blocking** (`bool`, defaults to `False`) --
If True, offloading and onloading is done with non-blocking data transfer.
- **use_stream** (`bool`, defaults to `False`) --
If True, offloading and onloading is done asynchronously using a CUDA stream. This can be useful for
overlapping computation and data transfer.
- **record_stream** (`bool`, defaults to `False`) -- When enabled with `use_stream`, it marks the current tensor
as having been used by this stream. It is faster at the expense of slightly more memory usage. Refer to the
[PyTorch official docs](https://pytorch.org/docs/stable/generated/torch.Tensor.record_stream.html) for more
details.
- **low_cpu_mem_usage** (`bool`, defaults to `False`) --
If True, the CPU memory usage is minimized by pinning tensors on-the-fly instead of pre-pinning them. This
option only matters when using streamed CPU offloading (i.e. `use_stream=True`). This can be useful when
the CPU memory is a bottleneck but may counteract the benefits of using streams.</paramsdesc><paramgroups>0</paramgroups></docstring>
Applies group offloading to the internal layers of a torch.nn.Module. To understand what group offloading is, and
where it is beneficial, we need to first provide some context on how other supported offloading methods work.
Typically, offloading is done at two levels:
- Module-level: In Diffusers, this can be enabled using the `DiffusionPipeline.enable_model_cpu_offload()` method.
It works by offloading each component of a pipeline to the CPU for storage, and onloading to the accelerator device
when needed for computation. This method is more memory-efficient than keeping all components on the accelerator,
but the memory requirements are still quite high. For this method to work, one needs memory equivalent to the size
of the model in its runtime dtype plus the size of the largest intermediate activation tensors to be able to
complete the forward pass.
- Leaf-level: In Diffusers, this can be enabled using the `DiffusionPipeline.enable_sequential_cpu_offload()`
method. It works by offloading the lowest leaf-level parameters of the computation graph to the CPU for storage,
and onloading only the leaves to the accelerator device for computation. This uses the lowest amount of accelerator
memory, but can be slower due to the excessive number of device synchronizations.
Group offloading is a middle ground between the two methods. It works by offloading groups of internal layers
(either `torch.nn.ModuleList` or `torch.nn.Sequential`). This method uses less memory than module-level
offloading. It is also faster than leaf-level/sequential offloading, as the number of device synchronizations is
reduced.
Another supported feature (for CUDA devices with support for asynchronous data transfer streams) is the ability to
overlap data transfer and computation to reduce the overall execution time compared to sequential offloading. This
is enabled using layer prefetching with streams, i.e., the layer that is to be executed next starts onloading to
the accelerator device while the current layer is being executed - this increases the memory requirements slightly.
Note that this implementation also supports leaf-level offloading, which can be made much faster when using streams.
<ExampleCodeBlock anchor="diffusers.hooks.apply_group_offloading.example">
Example:
```python
>>> import torch
>>> from diffusers import CogVideoXTransformer3DModel
>>> from diffusers.hooks import apply_group_offloading
>>> transformer = CogVideoXTransformer3DModel.from_pretrained(
...     "THUDM/CogVideoX-5b", subfolder="transformer", torch_dtype=torch.bfloat16
... )
>>> # Offload blocks to the CPU in groups of two, prefetching with a CUDA stream.
>>> apply_group_offloading(
...     transformer,
...     onload_device=torch.device("cuda"),
...     offload_device=torch.device("cpu"),
...     offload_type="block_level",
...     num_blocks_per_group=2,
...     use_stream=True,
... )
```
</ExampleCodeBlock>
</div>
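The effect of `num_blocks_per_group` can be pictured as chunking a model's ordered list of blocks into consecutive groups that are onloaded and offloaded together. The sketch below shows only that partitioning step, assuming consecutive chunking; `group_blocks` and the block names are hypothetical and do not reproduce the hook's internals.

```python
from typing import List, Sequence


def group_blocks(block_names: Sequence[str], num_blocks_per_group: int) -> List[List[str]]:
    """Partition an ordered list of blocks into consecutive groups.

    Hypothetical helper: the last group may be smaller when the number of
    blocks is not a multiple of num_blocks_per_group.
    """
    if num_blocks_per_group < 1:
        raise ValueError("num_blocks_per_group must be at least 1")
    return [
        list(block_names[start:start + num_blocks_per_group])
        for start in range(0, len(block_names), num_blocks_per_group)
    ]


if __name__ == "__main__":
    # Five hypothetical transformer blocks, moved two at a time.
    blocks = [f"transformer_blocks.{i}" for i in range(5)]
    for group in group_blocks(blocks, num_blocks_per_group=2):
        print(group)
```

Larger groups mean fewer transfers but more accelerator memory held at once; a group size of 1 moves one block at a time, and a group covering all blocks approaches module-level offloading.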
