# Caching methods

Cache methods speed up diffusion transformers by storing and reusing the intermediate outputs of specific layers, such as attention and feedforward layers, instead of recalculating them at each inference step.
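In practice, a caching technique is enabled by constructing a configuration object and attaching it to the pipeline's denoiser (a model that implements `CacheMixin`), as the per-method examples below show. A minimal sketch mirroring the `enable_cache` example further down; the CogVideoX checkpoint and parameter values are illustrative, and `disable_cache` is assumed to be the `CacheMixin` counterpart for turning caching off again:

```python
>>> import torch
>>> from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

>>> pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")

>>> # Build a caching config and attach it to the denoiser, which implements CacheMixin
>>> config = PyramidAttentionBroadcastConfig(
...     spatial_attention_block_skip_range=2,
...     spatial_attention_timestep_skip_range=(100, 800),
...     current_timestep_callback=lambda: pipe.current_timestep,
... )
>>> pipe.transformer.enable_cache(config)

>>> video = pipe("A cat playing with a ball of yarn").frames[0]

>>> # Assumed counterpart for removing the caching hooks again
>>> pipe.transformer.disable_cache()
```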
## CacheMixin[[diffusers.CacheMixin]]

<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class diffusers.CacheMixin</name><anchor>diffusers.CacheMixin</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/models/cache_utils.py#L23</source><parameters>[]</parameters></docstring>

A class for enabling/disabling caching techniques on diffusion models.

Supported caching techniques:

- [Pyramid Attention Broadcast](https://huggingface.co/papers/2408.12588)
- [FasterCache](https://huggingface.co/papers/2410.19355)
- [FirstBlockCache](https://github.com/chengzeyi/ParaAttention/blob/7a266123671b55e7e5a2fe9af3121f07a36afc78/README.md#first-block-cache-our-dynamic-caching)
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>cache_context</name><anchor>diffusers.CacheMixin.cache_context</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/models/cache_utils.py#L120</source><parameters>[{"name": "name", "val": ": str"}]</parameters></docstring>

Context manager that provides additional methods for cache management.
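A hedged sketch of how `cache_context` might be used when driving the denoiser manually: the variable names (`latents`, `prompt_embeds`, `negative_prompt_embeds`, `t`) are hypothetical stand-ins for tensors prepared by a custom sampling loop, and the context names are illustrative labels that keep the cached states of the conditional and unconditional passes separate:

```python
>>> # Hypothetical custom denoising step; a cache was enabled earlier via enable_cache()
>>> with pipe.transformer.cache_context("cond"):
...     noise_pred = pipe.transformer(
...         hidden_states=latents, encoder_hidden_states=prompt_embeds, timestep=t, return_dict=False
...     )[0]
>>> with pipe.transformer.cache_context("uncond"):
...     noise_pred_uncond = pipe.transformer(
...         hidden_states=latents, encoder_hidden_states=negative_prompt_embeds, timestep=t, return_dict=False
...     )[0]
```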
</div>

<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>enable_cache</name><anchor>diffusers.CacheMixin.enable_cache</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/models/cache_utils.py#L39</source><parameters>[{"name": "config", "val": ""}]</parameters><paramsdesc>- **config** (`Union[PyramidAttentionBroadcastConfig]`) --
The configuration for applying the caching technique. Currently supported caching techniques are:
- [PyramidAttentionBroadcastConfig](/docs/diffusers/pr_12595/en/api/cache#diffusers.PyramidAttentionBroadcastConfig)</paramsdesc><paramgroups>0</paramgroups></docstring>

Enable caching techniques on the model.
<ExampleCodeBlock anchor="diffusers.CacheMixin.enable_cache.example">

Example:

```python
>>> import torch
>>> from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig
>>> pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")
>>> config = PyramidAttentionBroadcastConfig(
...     spatial_attention_block_skip_range=2,
...     spatial_attention_timestep_skip_range=(100, 800),
...     current_timestep_callback=lambda: pipe.current_timestep,
... )
>>> pipe.transformer.enable_cache(config)
```

</ExampleCodeBlock>
</div></div>
## PyramidAttentionBroadcastConfig[[diffusers.PyramidAttentionBroadcastConfig]]

<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class diffusers.PyramidAttentionBroadcastConfig</name><anchor>diffusers.PyramidAttentionBroadcastConfig</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/pyramid_attention_broadcast.py#L40</source><parameters>[{"name": "spatial_attention_block_skip_range", "val": ": typing.Optional[int] = None"}, {"name": "temporal_attention_block_skip_range", "val": ": typing.Optional[int] = None"}, {"name": "cross_attention_block_skip_range", "val": ": typing.Optional[int] = None"}, {"name": "spatial_attention_timestep_skip_range", "val": ": typing.Tuple[int, int] = (100, 800)"}, {"name": "temporal_attention_timestep_skip_range", "val": ": typing.Tuple[int, int] = (100, 800)"}, {"name": "cross_attention_timestep_skip_range", "val": ": typing.Tuple[int, int] = (100, 800)"}, {"name": "spatial_attention_block_identifiers", "val": ": typing.Tuple[str, ...] = ('blocks', 'transformer_blocks', 'single_transformer_blocks', 'layers')"}, {"name": "temporal_attention_block_identifiers", "val": ": typing.Tuple[str, ...] = ('temporal_transformer_blocks',)"}, {"name": "cross_attention_block_identifiers", "val": ": typing.Tuple[str, ...] = ('blocks', 'transformer_blocks', 'layers')"}, {"name": "current_timestep_callback", "val": ": typing.Callable[[], int] = None"}]</parameters><paramsdesc>- **spatial_attention_block_skip_range** (`int`, *optional*, defaults to `None`) --
The number of times a specific spatial attention broadcast is skipped before computing the attention states
to re-use. If this is set to the value `N`, the attention computation will be skipped `N - 1` times (i.e.,
old attention states will be reused) before computing the new attention states again.
- **temporal_attention_block_skip_range** (`int`, *optional*, defaults to `None`) --
The number of times a specific temporal attention broadcast is skipped before computing the attention
states to re-use. If this is set to the value `N`, the attention computation will be skipped `N - 1` times
(i.e., old attention states will be reused) before computing the new attention states again.
- **cross_attention_block_skip_range** (`int`, *optional*, defaults to `None`) --
The number of times a specific cross-attention broadcast is skipped before computing the attention states
to re-use. If this is set to the value `N`, the attention computation will be skipped `N - 1` times (i.e.,
old attention states will be reused) before computing the new attention states again.
- **spatial_attention_timestep_skip_range** (`Tuple[int, int]`, defaults to `(100, 800)`) --
The range of timesteps to skip in the spatial attention layer. The attention computations will be
conditionally skipped if the current timestep is within the specified range.
- **temporal_attention_timestep_skip_range** (`Tuple[int, int]`, defaults to `(100, 800)`) --
The range of timesteps to skip in the temporal attention layer. The attention computations will be
conditionally skipped if the current timestep is within the specified range.
- **cross_attention_timestep_skip_range** (`Tuple[int, int]`, defaults to `(100, 800)`) --
The range of timesteps to skip in the cross-attention layer. The attention computations will be
conditionally skipped if the current timestep is within the specified range.
- **spatial_attention_block_identifiers** (`Tuple[str, ...]`) --
The identifiers to match against the layer names to determine if the layer is a spatial attention layer.
- **temporal_attention_block_identifiers** (`Tuple[str, ...]`) --
The identifiers to match against the layer names to determine if the layer is a temporal attention layer.
- **cross_attention_block_identifiers** (`Tuple[str, ...]`) --
The identifiers to match against the layer names to determine if the layer is a cross-attention layer.</paramsdesc><paramgroups>0</paramgroups></docstring>

Configuration for Pyramid Attention Broadcast.
</div>
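Skipping is not limited to spatial attention: the identifiers and skip ranges above also cover temporal and cross-attention layers. A hedged sketch combining spatial and cross-attention skipping; the skip values are illustrative rather than recommendations from the paper, and `pipe` is the CogVideoX pipeline from the surrounding examples:

```python
>>> from diffusers import PyramidAttentionBroadcastConfig

>>> # Illustrative values: cross-attention states tend to be the most similar across
>>> # timesteps, so they can be re-used for longer than spatial attention states.
>>> config = PyramidAttentionBroadcastConfig(
...     spatial_attention_block_skip_range=2,
...     cross_attention_block_skip_range=6,
...     spatial_attention_timestep_skip_range=(100, 800),
...     cross_attention_timestep_skip_range=(100, 800),
...     current_timestep_callback=lambda: pipe.current_timestep,
... )
>>> pipe.transformer.enable_cache(config)
```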
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.apply_pyramid_attention_broadcast</name><anchor>diffusers.apply_pyramid_attention_broadcast</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/pyramid_attention_broadcast.py#L181</source><parameters>[{"name": "module", "val": ": Module"}, {"name": "config", "val": ": PyramidAttentionBroadcastConfig"}]</parameters><paramsdesc>- **module** (`torch.nn.Module`) --
The module to apply Pyramid Attention Broadcast to.
- **config** (`PyramidAttentionBroadcastConfig`) --
The configuration to use for Pyramid Attention Broadcast.</paramsdesc><paramgroups>0</paramgroups></docstring>

Apply [Pyramid Attention Broadcast](https://huggingface.co/papers/2408.12588) to a given module.

PAB is an attention approximation method that leverages the similarity in attention states between timesteps to
reduce the computational cost of attention computation. The key takeaway from the paper is that the attention
similarity in the cross-attention layers between timesteps is high, followed by less similarity in the temporal and
spatial layers. This allows for the skipping of attention computation in the cross-attention layers more frequently
than in the temporal and spatial layers. Applying PAB will, therefore, speed up the inference process.
<ExampleCodeBlock anchor="diffusers.apply_pyramid_attention_broadcast.example">

Example:

```python
>>> import torch
>>> from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig, apply_pyramid_attention_broadcast
>>> from diffusers.utils import export_to_video
>>> pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")
>>> config = PyramidAttentionBroadcastConfig(
...     spatial_attention_block_skip_range=2,
...     spatial_attention_timestep_skip_range=(100, 800),
...     current_timestep_callback=lambda: pipe.current_timestep,
... )
>>> apply_pyramid_attention_broadcast(pipe.transformer, config)
```

</ExampleCodeBlock>
</div>
## FasterCacheConfig[[diffusers.FasterCacheConfig]]

<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class diffusers.FasterCacheConfig</name><anchor>diffusers.FasterCacheConfig</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/faster_cache.py#L50</source><parameters>[{"name": "spatial_attention_block_skip_range", "val": ": int = 2"}, {"name": "temporal_attention_block_skip_range", "val": ": typing.Optional[int] = None"}, {"name": "spatial_attention_timestep_skip_range", "val": ": typing.Tuple[int, int] = (-1, 681)"}, {"name": "temporal_attention_timestep_skip_range", "val": ": typing.Tuple[int, int] = (-1, 681)"}, {"name": "low_frequency_weight_update_timestep_range", "val": ": typing.Tuple[int, int] = (99, 901)"}, {"name": "high_frequency_weight_update_timestep_range", "val": ": typing.Tuple[int, int] = (-1, 301)"}, {"name": "alpha_low_frequency", "val": ": float = 1.1"}, {"name": "alpha_high_frequency", "val": ": float = 1.1"}, {"name": "unconditional_batch_skip_range", "val": ": int = 5"}, {"name": "unconditional_batch_timestep_skip_range", "val": ": typing.Tuple[int, int] = (-1, 641)"}, {"name": "spatial_attention_block_identifiers", "val": ": typing.Tuple[str, ...] = ('^blocks.*attn', '^transformer_blocks.*attn', '^single_transformer_blocks.*attn')"}, {"name": "temporal_attention_block_identifiers", "val": ": typing.Tuple[str, ...] = ('^temporal_transformer_blocks.*attn',)"}, {"name": "attention_weight_callback", "val": ": typing.Callable[[torch.nn.modules.module.Module], float] = None"}, {"name": "low_frequency_weight_callback", "val": ": typing.Callable[[torch.nn.modules.module.Module], float] = None"}, {"name": "high_frequency_weight_callback", "val": ": typing.Callable[[torch.nn.modules.module.Module], float] = None"}, {"name": "tensor_format", "val": ": str = 'BCFHW'"}, {"name": "is_guidance_distilled", "val": ": bool = False"}, {"name": "current_timestep_callback", "val": ": typing.Callable[[], int] = None"}, {"name": "_unconditional_conditional_input_kwargs_identifiers", "val": ": typing.List[str] = ('hidden_states', 'encoder_hidden_states', 'timestep', 'attention_mask', 'encoder_attention_mask')"}]</parameters><paramsdesc>- **spatial_attention_block_skip_range** (`int`, defaults to `2`) --
Calculate the attention states every `N` iterations. If this is set to `N`, the attention computation will
be skipped `N - 1` times (i.e., cached attention states will be reused) before computing the new attention
states again.
- **temporal_attention_block_skip_range** (`int`, *optional*, defaults to `None`) --
Calculate the attention states every `N` iterations. If this is set to `N`, the attention computation will
be skipped `N - 1` times (i.e., cached attention states will be reused) before computing the new attention
states again.
- **spatial_attention_timestep_skip_range** (`Tuple[int, int]`, defaults to `(-1, 681)`) --
The timestep range within which the spatial attention computation can be skipped without a significant loss
in quality. This is to be determined by the user based on the underlying model. The first value in the
tuple is the lower bound and the second value is the upper bound. Typically, diffusion timesteps for
denoising are in the reversed range of 0 to 1000 (i.e. denoising starts at timestep 1000 and ends at
timestep 0). For the default values, this would mean that the spatial attention computation skipping will
be applicable only after denoising timestep 681 is reached, and continue until the end of the denoising
process.
- **temporal_attention_timestep_skip_range** (`Tuple[int, int]`, defaults to `(-1, 681)`) --
The timestep range within which the temporal attention computation can be skipped without a significant
loss in quality. This is to be determined by the user based on the underlying model. The first value in the
tuple is the lower bound and the second value is the upper bound. Typically, diffusion timesteps for
denoising are in the reversed range of 0 to 1000 (i.e. denoising starts at timestep 1000 and ends at
timestep 0).
- **low_frequency_weight_update_timestep_range** (`Tuple[int, int]`, defaults to `(99, 901)`) --
The timestep range within which the low frequency weight scaling update is applied. The first value in the
tuple is the lower bound and the second value is the upper bound of the timestep range. The callback
function for the update is called only within this range.
- **high_frequency_weight_update_timestep_range** (`Tuple[int, int]`, defaults to `(-1, 301)`) --
The timestep range within which the high frequency weight scaling update is applied. The first value in the
tuple is the lower bound and the second value is the upper bound of the timestep range. The callback
function for the update is called only within this range.
- **alpha_low_frequency** (`float`, defaults to `1.1`) --
The weight to scale the low frequency updates by. This is used to approximate the unconditional branch from
the conditional branch outputs.
- **alpha_high_frequency** (`float`, defaults to `1.1`) --
The weight to scale the high frequency updates by. This is used to approximate the unconditional branch
from the conditional branch outputs.
- **unconditional_batch_skip_range** (`int`, defaults to `5`) --
Process the unconditional branch every `N` iterations. If this is set to `N`, the unconditional branch
computation will be skipped `N - 1` times (i.e., cached unconditional branch states will be reused) before
computing the new unconditional branch states again.
- **unconditional_batch_timestep_skip_range** (`Tuple[int, int]`, defaults to `(-1, 641)`) --
The timestep range within which the unconditional branch computation can be skipped without a significant
loss in quality. This is to be determined by the user based on the underlying model. The first value in the
tuple is the lower bound and the second value is the upper bound.
- **spatial_attention_block_identifiers** (`Tuple[str, ...]`, defaults to `("^blocks.*attn", "^transformer_blocks.*attn", "^single_transformer_blocks.*attn")`) --
The identifiers to match the spatial attention blocks in the model. If the name of the block contains any
of these identifiers, FasterCache will be applied to that block. This can either be the full layer names,
partial layer names, or regex patterns. Matching will always be done using a regex match.
- **temporal_attention_block_identifiers** (`Tuple[str, ...]`, defaults to `("^temporal_transformer_blocks.*attn",)`) --
The identifiers to match the temporal attention blocks in the model. If the name of the block contains any
of these identifiers, FasterCache will be applied to that block. This can either be the full layer names,
partial layer names, or regex patterns. Matching will always be done using a regex match.
- **attention_weight_callback** (`Callable[[torch.nn.Module], float]`, defaults to `None`) --
The callback function to determine the weight to scale the attention outputs by. This function should take
the attention module as input and return a float value. This is used to approximate the unconditional
branch from the conditional branch outputs. If not provided, the default weight is 0.5 for all timesteps.
Typically, as described in the paper, this weight should gradually increase from 0 to 1 as the inference
progresses. Users are encouraged to experiment and provide custom weight schedules that take into account
the number of inference steps and underlying model behaviour as denoising progresses.
- **low_frequency_weight_callback** (`Callable[[torch.nn.Module], float]`, defaults to `None`) --
The callback function to determine the weight to scale the low frequency updates by. If not provided, the
default weight is 1.1 for timesteps within the range specified (as described in the paper).
- **high_frequency_weight_callback** (`Callable[[torch.nn.Module], float]`, defaults to `None`) --
The callback function to determine the weight to scale the high frequency updates by. If not provided, the
default weight is 1.1 for timesteps within the range specified (as described in the paper).
- **tensor_format** (`str`, defaults to `"BCFHW"`) --
The format of the input tensors. This should be one of `"BCFHW"`, `"BFCHW"`, or `"BCHW"`. The format is
used to split individual latent frames in order for low and high frequency components to be computed.
- **is_guidance_distilled** (`bool`, defaults to `False`) --
Whether the model is guidance distilled or not. If the model is guidance distilled, FasterCache will not be
applied at the denoiser-level to skip the unconditional branch computation (as there is none).
- **_unconditional_conditional_input_kwargs_identifiers** (`List[str]`, defaults to `("hidden_states", "encoder_hidden_states", "timestep", "attention_mask", "encoder_attention_mask")`) --
The identifiers to match the input kwargs that contain the batchwise-concatenated unconditional and
conditional inputs. If the name of the input kwargs contains any of these identifiers, FasterCache will
split the inputs into unconditional and conditional branches. This must be a list of exact input kwargs
names that contain the batchwise-concatenated unconditional and conditional inputs.</paramsdesc><paramgroups>0</paramgroups></docstring>
Configuration for [FasterCache](https://huggingface.co/papers/2410.19355).
</div>
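The `attention_weight_callback` described above receives the attention module and returns a float; the paper suggests letting this weight grow from 0 to 1 as denoising progresses. A hedged sketch of one way to build such a schedule by closing over the pipeline's current timestep; the linear ramp and the 1000-timestep assumption are illustrative and should be adapted to the model's scheduler:

```python
>>> import torch
>>> from diffusers import CogVideoXPipeline, FasterCacheConfig, apply_faster_cache

>>> pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")

>>> def attention_weight(module):
...     # Illustrative ramp: timestep ~1000 at the start of denoising -> weight ~0,
...     # timestep ~0 at the end of denoising -> weight ~1.
...     return 1.0 - pipe.current_timestep / 1000.0

>>> config = FasterCacheConfig(
...     spatial_attention_block_skip_range=2,
...     spatial_attention_timestep_skip_range=(-1, 681),
...     attention_weight_callback=attention_weight,
...     current_timestep_callback=lambda: pipe.current_timestep,
...     tensor_format="BFCHW",
... )
>>> apply_faster_cache(pipe.transformer, config)
```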
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.apply_faster_cache</name><anchor>diffusers.apply_faster_cache</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/faster_cache.py#L486</source><parameters>[{"name": "module", "val": ": Module"}, {"name": "config", "val": ": FasterCacheConfig"}]</parameters><paramsdesc>- **module** (`torch.nn.Module`) --
The PyTorch module to apply FasterCache to. Typically, this should be a transformer architecture supported
in Diffusers, such as `CogVideoXTransformer3DModel`, but external implementations may also work.
- **config** (`FasterCacheConfig`) --
The configuration to use for FasterCache.</paramsdesc><paramgroups>0</paramgroups></docstring>

Applies [FasterCache](https://huggingface.co/papers/2410.19355) to a given module.
<ExampleCodeBlock anchor="diffusers.apply_faster_cache.example">

Example:

```python
>>> import torch
>>> from diffusers import CogVideoXPipeline, FasterCacheConfig, apply_faster_cache
>>> pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")
>>> config = FasterCacheConfig(
...     spatial_attention_block_skip_range=2,
...     spatial_attention_timestep_skip_range=(-1, 681),
...     low_frequency_weight_update_timestep_range=(99, 641),
...     high_frequency_weight_update_timestep_range=(-1, 301),
...     spatial_attention_block_identifiers=["transformer_blocks"],
...     attention_weight_callback=lambda _: 0.3,
...     tensor_format="BFCHW",
... )
>>> apply_faster_cache(pipe.transformer, config)
```

</ExampleCodeBlock>
</div>
## FirstBlockCacheConfig[[diffusers.FirstBlockCacheConfig]]

<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class diffusers.FirstBlockCacheConfig</name><anchor>diffusers.FirstBlockCacheConfig</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/first_block_cache.py#L34</source><parameters>[{"name": "threshold", "val": ": float = 0.05"}]</parameters><paramsdesc>- **threshold** (`float`, defaults to `0.05`) --
The threshold to determine whether or not a forward pass through all layers of the model is required. A
higher threshold usually results in a forward pass through a lower number of layers and faster inference,
but might lead to poorer generation quality. A lower threshold may not result in significant generation
speedup. The threshold is compared against the absmean difference of the residuals between the current and
cached outputs from the first transformer block. If the difference is below the threshold, the forward pass
is skipped.</paramsdesc><paramgroups>0</paramgroups></docstring>

Configuration for [First Block
Cache](https://github.com/chengzeyi/ParaAttention/blob/7a266123671b55e7e5a2fe9af3121f07a36afc78/README.md#first-block-cache-our-dynamic-caching).
</div>
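FirstBlockCache is also listed among the techniques supported by `CacheMixin`, so the configuration can presumably be passed to `enable_cache` instead of calling the helper function below directly. A hedged sketch under that assumption, reusing the CogView4 setup from the following example; a lower threshold trades some of the speedup for outputs closer to the uncached baseline:

```python
>>> import torch
>>> from diffusers import CogView4Pipeline
>>> from diffusers.hooks import FirstBlockCacheConfig

>>> pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")

>>> # Assumption: CacheMixin.enable_cache dispatches on the config type
>>> pipe.transformer.enable_cache(FirstBlockCacheConfig(threshold=0.05))
```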
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>diffusers.apply_first_block_cache</name><anchor>diffusers.apply_first_block_cache</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/hooks/first_block_cache.py#L194</source><parameters>[{"name": "module", "val": ": Module"}, {"name": "config", "val": ": FirstBlockCacheConfig"}]</parameters><paramsdesc>- **module** (`torch.nn.Module`) --
The PyTorch module to apply FBCache to. Typically, this should be a transformer architecture supported in
Diffusers, such as `CogVideoXTransformer3DModel`, but external implementations may also work.
- **config** (`FirstBlockCacheConfig`) --
The configuration to use for applying the FBCache method.</paramsdesc><paramgroups>0</paramgroups></docstring>

Applies [First Block
Cache](https://github.com/chengzeyi/ParaAttention/blob/4de137c5b96416489f06e43e19f2c14a772e28fd/README.md#first-block-cache-our-dynamic-caching)
to a given module.

First Block Cache builds on the ideas of [TeaCache](https://huggingface.co/papers/2411.19108). It is much simpler
to implement generically for a wide range of models and has been integrated first for experimental purposes.
<ExampleCodeBlock anchor="diffusers.apply_first_block_cache.example">

Example:

```python
>>> import torch
>>> from diffusers import CogView4Pipeline
>>> from diffusers.hooks import apply_first_block_cache, FirstBlockCacheConfig
>>> pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16)
>>> pipe.to("cuda")
>>> apply_first_block_cache(pipe.transformer, FirstBlockCacheConfig(threshold=0.2))
>>> prompt = "A photo of an astronaut riding a horse on mars"
>>> image = pipe(prompt, generator=torch.Generator().manual_seed(42)).images[0]
>>> image.save("output.png")
```

</ExampleCodeBlock>
</div>
<EditOnGithub source="https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/cache.md" />