Parallelism strategies speed up diffusion transformers by distributing computation across multiple devices, which reduces inference and training time. Refer to the Distributed inference guide to learn more.
class diffusers.ContextParallelConfig

Source: https://github.com/huggingface/diffusers/blob/vr_12595/src/diffusers/models/_modeling_parallel.py#L41

(
ring_degree: typing.Optional[int] = None,
ulysses_degree: typing.Optional[int] = None,
convert_to_fp32: bool = True,
rotate_method: typing.Literal['allgather', 'alltoall'] = 'allgather',
_rank: int = None,
_world_size: int = None,
_device: device = None,
_mesh: DeviceMesh = None,
_flattened_mesh: DeviceMesh = None,
_ring_mesh: DeviceMesh = None,
_ulysses_mesh: DeviceMesh = None,
_ring_local_rank: int = None,
_ulysses_local_rank: int = None
)

ring_degree (int, optional, defaults to 1) --
Number of devices to use for ring attention within a context parallel region. Must be a divisor of the
total number of devices in the context parallel mesh.
ulysses_degree (int, optional, defaults to 1) --
Number of devices to use for ulysses attention within a context parallel region. Must be a divisor of the
total number of devices in the context parallel mesh.
convert_to_fp32 (bool, optional, defaults to True) --
Whether to convert output and LSE to float32 for ring attention numerical stability.
rotate_method (str, optional, defaults to "allgather") --
Method to use for rotating key/value states across devices in ring attention. Currently, only "allgather"
is supported.
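
The divisor constraint on `ring_degree` and `ulysses_degree` described above can be sketched in plain Python. This is only an illustration: `check_context_parallel_degrees` and its `world_size` argument are hypothetical names, not part of diffusers, and the sketch assumes an unset degree is treated as 1, as the documented defaults suggest. The library's own validation may differ.

```python
from typing import Optional, Tuple


def check_context_parallel_degrees(
    world_size: int,
    ring_degree: Optional[int] = None,
    ulysses_degree: Optional[int] = None,
) -> Tuple[int, int]:
    """Hypothetical helper (not a diffusers API): check the documented rule
    that each degree must divide the number of devices in the mesh."""
    # Assumption: an unset degree falls back to the documented default of 1.
    ring = 1 if ring_degree is None else ring_degree
    ulysses = 1 if ulysses_degree is None else ulysses_degree

    for name, degree in (("ring_degree", ring), ("ulysses_degree", ulysses)):
        if degree < 1:
            raise ValueError(f"{name} must be a positive integer, got {degree}")
        # Documented constraint: the degree must be a divisor of the device count.
        if world_size % degree != 0:
            raise ValueError(
                f"{name}={degree} is not a divisor of the {world_size} devices"
            )
    return ring, ulysses


# Example: 8 devices with ring attention spread across 2 of them.
ring, ulysses = check_context_parallel_degrees(world_size=8, ring_degree=2)
```

Values that pass such a check could then be supplied as `ContextParallelConfig(ring_degree=ring, ulysses_degree=ulysses)`, following the signature shown above.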