Attention Processor
An attention processor is a class for applying different types of attention mechanisms.
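The core design can be sketched in a few lines: an attention module delegates its forward pass to a swappable processor object, so the attention computation can be changed at runtime without touching the module itself. This is a minimal pure-Python sketch of the pattern, not the diffusers implementation; the class and method names below are illustrative only.

```python
class DefaultProcessor:
    # Stand-in for a real attention computation.
    def __call__(self, attn, hidden_states):
        return f"default({hidden_states})"

class LoggingProcessor:
    # Hypothetical custom processor that wraps the default one.
    def __call__(self, attn, hidden_states):
        return f"logged({DefaultProcessor()(attn, hidden_states)})"

class Attention:
    def __init__(self):
        self.processor = DefaultProcessor()

    def set_processor(self, processor):
        # Swap the attention computation at runtime; this mirrors the
        # spirit of setting a processor on a diffusers attention module.
        self.processor = processor

    def forward(self, hidden_states):
        # The module itself only delegates.
        return self.processor(self, hidden_states)

attn = Attention()
print(attn.forward("x"))   # default(x)
attn.set_processor(LoggingProcessor())
print(attn.forward("x"))   # logged(default(x))
```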
AttnProcessor[[diffusers.models.attention_processor.AttnProcessor]]
class diffusers.models.attention_processor.AttnProcessor
Default processor for performing attention-related computations.
class diffusers.models.attention_processor.AttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0).
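The computation these processors perform is softmax(QKᵀ/√d)V; PyTorch 2.0's `F.scaled_dot_product_attention` evaluates it with fused, memory-efficient kernels. A minimal pure-Python sketch of the math (rows of Q, K, V as lists; not an efficient or real implementation):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sdpa(q, k, v):
    # softmax(Q K^T / sqrt(d)) V, one query row at a time.
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = sdpa(q, k, v)
# Each output row is a convex combination of the rows of V, weighted
# toward the keys most similar to that query.
```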
class diffusers.models.attention_processor.AttnAddedKVProcessor
Processor for performing attention-related computations with extra learnable key and value matrices for the text encoder.
class diffusers.models.attention_processor.AttnAddedKVProcessor2_0
Processor for performing scaled dot-product attention (enabled by default if you're using PyTorch 2.0), with extra learnable key and value matrices for the text encoder.
class diffusers.models.attention_processor.AttnProcessorNPU
Processor for implementing flash attention using torch_npu. torch_npu supports only the fp16 and bf16 data types. If fp32 is used, F.scaled_dot_product_attention is used for the computation instead, but the acceleration effect on NPU is not significant.
class diffusers.models.attention_processor.FusedAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). It uses fused projection layers. For self-attention modules, all projection matrices (i.e., query, key, value) are fused. For cross-attention modules, key and value projection matrices are fused.
> This API is currently 🧪 experimental in nature and can change in the future.
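The fusion idea itself is simple: instead of three separate matmuls for the query, key, and value projections, stack the weight matrices and do a single matmul, then split the result. A pure-Python toy (not the diffusers code) showing the equivalence:

```python
def matvec(w, x):
    # Multiply matrix w (list of rows) by vector x.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

w_q = [[1.0, 0.0], [0.0, 1.0]]
w_k = [[2.0, 0.0], [0.0, 2.0]]
w_v = [[3.0, 0.0], [0.0, 3.0]]
x = [1.0, 2.0]

# Separate projections: three matmuls.
q, k, v = matvec(w_q, x), matvec(w_k, x), matvec(w_v, x)

# Fused: one matmul with the stacked weight matrix, then split.
w_qkv = w_q + w_k + w_v
qkv = matvec(w_qkv, x)
q2, k2, v2 = qkv[0:2], qkv[2:4], qkv[4:6]

assert (q, k, v) == (q2, k2, v2)  # identical results, one matmul instead of three
```

One larger matmul typically utilizes the hardware better than three small ones, which is why the fused variants exist.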
Allegro[[diffusers.models.attention_processor.AllegroAttnProcessor2_0]]
class diffusers.models.attention_processor.AllegroAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). This is used in the Allegro model. It applies a normalization layer and rotary embedding on the query and key vectors.
AuraFlow[[diffusers.models.attention_processor.AuraFlowAttnProcessor2_0]]
class diffusers.models.attention_processor.AuraFlowAttnProcessor2_0
class diffusers.models.attention_processor.FusedAuraFlowAttnProcessor2_0
CogVideoX[[diffusers.models.attention_processor.CogVideoXAttnProcessor2_0]]
class diffusers.models.attention_processor.CogVideoXAttnProcessor2_0
Processor for implementing scaled dot-product attention for the CogVideoX model. It applies a rotary embedding on query and key vectors, but does not include spatial normalization.
class diffusers.models.attention_processor.FusedCogVideoXAttnProcessor2_0
Processor for implementing scaled dot-product attention for the CogVideoX model. It applies a rotary embedding on query and key vectors, but does not include spatial normalization.
CrossFrameAttnProcessor[[diffusers.pipelines.text_to_video_synthesis.pipeline_text_to_video_zero.CrossFrameAttnProcessor]]
class diffusers.pipelines.text_to_video_synthesis.pipeline_text_to_video_zero.CrossFrameAttnProcessor
Cross-frame attention processor. Each frame attends to the first frame.
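The key idea can be sketched in a few lines: when attending over a batch of video frames, every frame's keys and values are replaced with those of the first frame, so all frames attend to frame 0 and stay visually consistent with it. A toy sketch with placeholder data, not the real processor:

```python
def cross_frame_kv(keys_per_frame, values_per_frame):
    # Replace each frame's keys/values with frame 0's, so every
    # frame attends to the first frame.
    first_k = keys_per_frame[0]
    first_v = values_per_frame[0]
    return ([first_k] * len(keys_per_frame),
            [first_v] * len(values_per_frame))

ks = [["k0"], ["k1"], ["k2"]]
vs = [["v0"], ["v1"], ["v2"]]
ks2, vs2 = cross_frame_kv(ks, vs)
# All frames now share frame 0's keys and values.
```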
Custom Diffusion[[diffusers.models.attention_processor.CustomDiffusionAttnProcessor]]
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor

- train_kv (bool, defaults to True) -- Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to True) -- Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) -- The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) -- The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) -- Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) -- The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method.
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor2_0

- train_kv (bool, defaults to True) -- Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to True) -- Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) -- The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) -- The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) -- Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) -- The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method using PyTorch 2.0’s memory-efficient scaled dot-product attention.
class diffusers.models.attention_processor.CustomDiffusionXFormersAttnProcessor

- train_kv (bool, defaults to True) -- Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to True) -- Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) -- The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) -- The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) -- Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) -- The dropout probability to use.
- attention_op (Callable, optional, defaults to None) -- The base operator to use as the attention operator. It is recommended to set to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method.
Flux[[diffusers.models.attention_processor.FluxAttnProcessor2_0]]
class diffusers.models.attention_processor.FluxAttnProcessor2_0
class diffusers.models.attention_processor.FusedFluxAttnProcessor2_0
class diffusers.models.attention_processor.FluxSingleAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0).
Hunyuan[[diffusers.models.attention_processor.HunyuanAttnProcessor2_0]]
class diffusers.models.attention_processor.HunyuanAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). This is used in the HunyuanDiT model. It applies a normalization layer and rotary embedding on the query and key vectors.
class diffusers.models.attention_processor.FusedHunyuanAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0) with fused projection layers. This is used in the HunyuanDiT model. It applies a normalization layer and rotary embedding on the query and key vectors.
class diffusers.models.attention_processor.PAGHunyuanAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). This is used in the HunyuanDiT model. It applies a normalization layer and rotary embedding on the query and key vectors. This variant of the processor employs Perturbed Attention Guidance.
class diffusers.models.attention_processor.PAGCFGHunyuanAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). This is used in the HunyuanDiT model. It applies a normalization layer and rotary embedding on the query and key vectors. This variant of the processor employs Perturbed Attention Guidance.
IdentitySelfAttnProcessor2_0[[diffusers.models.attention_processor.PAGIdentitySelfAttnProcessor2_0]]
class diffusers.models.attention_processor.PAGIdentitySelfAttnProcessor2_0
Processor for implementing PAG using scaled dot-product attention (enabled by default if you're using PyTorch 2.0). PAG reference: https://huggingface.co/papers/2403.17377
class diffusers.models.attention_processor.PAGCFGIdentitySelfAttnProcessor2_0
Processor for implementing PAG using scaled dot-product attention (enabled by default if you're using PyTorch 2.0). PAG reference: https://huggingface.co/papers/2403.17377
IP-Adapter[[diffusers.models.attention_processor.IPAdapterAttnProcessor]]
class diffusers.models.attention_processor.IPAdapterAttnProcessor

- hidden_size (int) -- The hidden size of the attention layer.
- cross_attention_dim (int) -- The number of channels in the encoder_hidden_states.
- num_tokens (int, Tuple[int] or List[int], defaults to (4,)) -- The context length of the image features.
- scale (float or List[float], defaults to 1.0) -- The weight scale of the image prompt.
Attention processor for Multiple IP-Adapters.
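The combination rule these processors apply is: the usual text cross-attention output is augmented with one extra attention result per image prompt, each weighted by its scale. A toy sketch with scalars standing in for attention outputs (illustrative only, not the real processor):

```python
def ip_adapter_combine(text_attn_out, image_attn_outs, scales):
    # Start from the ordinary text cross-attention output and add each
    # image-prompt attention output, weighted by its per-adapter scale.
    out = text_attn_out
    for img_out, scale in zip(image_attn_outs, scales):
        out = out + scale * img_out
    return out

# One text stream plus two image prompts with different weights:
# 1.0 + 1.0 * 0.5 + 2.0 * 0.25 = 2.0
result = ip_adapter_combine(1.0, [0.5, 0.25], [1.0, 2.0])
print(result)  # 2.0
```

Setting a scale of 0.0 for an adapter disables its contribution entirely, which is how image-prompt strength is typically tuned.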
class diffusers.models.attention_processor.IPAdapterAttnProcessor2_0

- hidden_size (int) -- The hidden size of the attention layer.
- cross_attention_dim (int) -- The number of channels in the encoder_hidden_states.
- num_tokens (int, Tuple[int] or List[int], defaults to (4,)) -- The context length of the image features.
- scale (float or List[float], defaults to 1.0) -- The weight scale of the image prompt.
Attention processor for IP-Adapter for PyTorch 2.0.
class diffusers.models.attention_processor.SD3IPAdapterJointAttnProcessor2_0

- hidden_size (int) -- The number of hidden channels.
- ip_hidden_states_dim (int) -- The image feature dimension.
- head_dim (int) -- The number of head channels.
- timesteps_emb_dim (int, defaults to 1280) -- The number of input channels for the timestep embedding.
- scale (float, defaults to 0.5) -- IP-Adapter scale.
Attention processor for IP-Adapter used typically in processing the SD3-like self-attention projections, with additional image-based information and timestep embeddings.
JointAttnProcessor2_0[[diffusers.models.attention_processor.JointAttnProcessor2_0]]
class diffusers.models.attention_processor.JointAttnProcessor2_0
class diffusers.models.attention_processor.PAGJointAttnProcessor2_0
class diffusers.models.attention_processor.PAGCFGJointAttnProcessor2_0
class diffusers.models.attention_processor.FusedJointAttnProcessor2_0
LoRA[[diffusers.models.attention_processor.LoRAAttnProcessor]]
class diffusers.models.attention_processor.LoRAAttnProcessor
Processor for implementing attention with LoRA.
class diffusers.models.attention_processor.LoRAAttnProcessor2_0
Processor for implementing attention with LoRA (enabled by default if you're using PyTorch 2.0).
class diffusers.models.attention_processor.LoRAAttnAddedKVProcessor
Processor for implementing attention with LoRA with extra learnable key and value matrices for the text encoder.
class diffusers.models.attention_processor.LoRAXFormersAttnProcessor
Processor for implementing attention with LoRA using xFormers.
Lumina-T2X[[diffusers.models.attention_processor.LuminaAttnProcessor2_0]]
class diffusers.models.attention_processor.LuminaAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). This is used in the LuminaNextDiT model. It applies a normalization layer and rotary embedding on the query and key vectors.
Mochi[[diffusers.models.attention_processor.MochiAttnProcessor2_0]]
class diffusers.models.attention_processor.MochiAttnProcessor2_0
class diffusers.models.attention_processor.MochiVaeAttnProcessor2_0
Attention processor used in Mochi VAE.
Sana[[diffusers.models.attention_processor.SanaLinearAttnProcessor2_0]]
class diffusers.models.attention_processor.SanaLinearAttnProcessor2_0
Processor for implementing scaled dot-product linear attention.
class diffusers.models.attention_processor.SanaMultiscaleAttnProcessor2_0
Processor for implementing multiscale quadratic attention.
class diffusers.models.attention_processor.PAGCFGSanaLinearAttnProcessor2_0
Processor for implementing scaled dot-product linear attention.
class diffusers.models.attention_processor.PAGIdentitySanaLinearAttnProcessor2_0
Processor for implementing scaled dot-product linear attention.
Stable Audio[[diffusers.models.attention_processor.StableAudioAttnProcessor2_0]]
class diffusers.models.attention_processor.StableAudioAttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you're using PyTorch 2.0). This is used in the Stable Audio model. It applies rotary embedding on the query and key vectors, and supports MHA, GQA, and MQA.
SlicedAttnProcessor[[diffusers.models.attention_processor.SlicedAttnProcessor]]
class diffusers.models.attention_processor.SlicedAttnProcessor

- slice_size (int, optional) -- The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention.
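Slicing trades speed for peak memory: the batched attention inputs are processed in chunks of slice_size, and because each chunk's attention is independent, the concatenated result equals the unsliced one. A pure-Python sketch of the chunking logic (the attention itself is a stand-in function, not the real computation):

```python
def attention_one(item):
    # Stand-in for a full attention computation on one batch-head item.
    return [x * 2 for x in item]

def sliced_attention(items, slice_size):
    # Process the batch dimension slice_size items at a time, so only
    # one slice's intermediate tensors are live in memory at once.
    out = []
    for start in range(0, len(items), slice_size):
        chunk = items[start:start + slice_size]
        out.extend(attention_one(it) for it in chunk)
    return out

items = [[1], [2], [3], [4]]
# Sliced and unsliced results agree; only peak memory differs.
assert sliced_attention(items, 2) == [attention_one(it) for it in items]
```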
class diffusers.models.attention_processor.SlicedAttnAddedKVProcessor

- slice_size (int, optional) -- The number of steps to compute attention. Uses as many slices as attention_head_dim // slice_size, and attention_head_dim must be a multiple of slice_size.
Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder.
XFormersAttnProcessor[[diffusers.models.attention_processor.XFormersAttnProcessor]]
class diffusers.models.attention_processor.XFormersAttnProcessor

- attention_op (Callable, optional, defaults to None) -- The base operator to use as the attention operator. It is recommended to set to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
class diffusers.models.attention_processor.XFormersAttnAddedKVProcessor

- attention_op (Callable, optional, defaults to None) -- The base operator to use as the attention operator. It is recommended to set to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
XLAFlashAttnProcessor2_0[[diffusers.models.attention_processor.XLAFlashAttnProcessor2_0]]
class diffusers.models.attention_processor.XLAFlashAttnProcessor2_0
Processor for implementing scaled dot-product attention with the Pallas flash attention kernel when using torch_xla.
XFormersJointAttnProcessor[[diffusers.models.attention_processor.XFormersJointAttnProcessor]]
class diffusers.models.attention_processor.XFormersJointAttnProcessor

- attention_op (Callable, optional, defaults to None) -- The base operator to use as the attention operator. It is recommended to set to None and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
IPAdapterXFormersAttnProcessor[[diffusers.models.attention_processor.IPAdapterXFormersAttnProcessor]]
class diffusers.models.attention_processor.IPAdapterXFormersAttnProcessor

- hidden_size (int) -- The hidden size of the attention layer.
- cross_attention_dim (int) -- The number of channels in the encoder_hidden_states.
- num_tokens (int, Tuple[int] or List[int], defaults to (4,)) -- The context length of the image features.
- scale (float or List[float], defaults to 1.0) -- The weight scale of the image prompt.
- attention_op (Callable, optional, defaults to None) -- The base operator to use as the attention operator. It is recommended to set to None and allow xFormers to choose the best operator.
Attention processor for IP-Adapter using xFormers.
FluxIPAdapterJointAttnProcessor2_0[[diffusers.models.attention_processor.FluxIPAdapterJointAttnProcessor2_0]]
class diffusers.models.attention_processor.FluxIPAdapterJointAttnProcessor2_0
XLAFluxFlashAttnProcessor2_0[[diffusers.models.attention_processor.XLAFluxFlashAttnProcessor2_0]]
class diffusers.models.attention_processor.XLAFluxFlashAttnProcessor2_0
Processor for implementing scaled dot-product attention with the Pallas flash attention kernel when using torch_xla.