# Dance Diffusion
[Dance Diffusion](https://github.com/Harmonai-org/sample-generator) is by Zach Evans.
Dance Diffusion is the first in a suite of generative audio tools for producers and musicians released by [Harmonai](https://github.com/Harmonai-org).
> [!TIP]
> Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
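As a concrete illustration of the tip above, here is a minimal, hedged sketch of explicitly setting the pipeline's scheduler to `IPNDMScheduler` (the compatible scheduler listed in the reference below); the `harmonai/maestro-150k` checkpoint is the one used in the example later on this page:

```py
from diffusers import DanceDiffusionPipeline, IPNDMScheduler

# Load the pipeline, then rebuild the scheduler from the checkpoint's
# scheduler config so all timestep settings are preserved.
pipe = DanceDiffusionPipeline.from_pretrained("harmonai/maestro-150k")
pipe.scheduler = IPNDMScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

audios = pipe(audio_length_in_s=4.0).audios
```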
## DanceDiffusionPipeline[[diffusers.DanceDiffusionPipeline]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class diffusers.DanceDiffusionPipeline</name><anchor>diffusers.DanceDiffusionPipeline</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/pipelines/dance_diffusion/pipeline_dance_diffusion.py#L37</source><parameters>[{"name": "unet", "val": ": UNet1DModel"}, {"name": "scheduler", "val": ": SchedulerMixin"}]</parameters><paramsdesc>- **unet** ([UNet1DModel](/docs/diffusers/pr_12229/en/api/models/unet#diffusers.UNet1DModel)) --
A `UNet1DModel` to denoise the encoded audio.
- **scheduler** ([SchedulerMixin](/docs/diffusers/pr_12229/en/api/schedulers/overview#diffusers.SchedulerMixin)) --
A scheduler to be used in combination with `unet` to denoise the encoded audio latents. Can be one of
[IPNDMScheduler](/docs/diffusers/pr_12229/en/api/schedulers/ipndm#diffusers.IPNDMScheduler).</paramsdesc><paramgroups>0</paramgroups></docstring>
Pipeline for audio generation.
This model inherits from [DiffusionPipeline](/docs/diffusers/pr_12229/en/api/pipelines/overview#diffusers.DiffusionPipeline). Check the superclass documentation for the generic methods
implemented for all pipelines (downloading, saving, running on a particular device, etc.).
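For example, the inherited methods let you persist all of the pipeline's components to disk and reload them onto a particular device; a minimal sketch (the local directory name is hypothetical):

```py
from diffusers import DanceDiffusionPipeline

# Download once, then save all components (unet, scheduler) locally.
pipe = DanceDiffusionPipeline.from_pretrained("harmonai/maestro-150k")
pipe.save_pretrained("./maestro-150k-local")  # hypothetical local directory

# Reload from disk and move every component to the GPU in one call.
pipe = DanceDiffusionPipeline.from_pretrained("./maestro-150k-local").to("cuda")
```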
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>__call__</name><anchor>diffusers.DanceDiffusionPipeline.__call__</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/pipelines/dance_diffusion/pipeline_dance_diffusion.py#L59</source><parameters>[{"name": "batch_size", "val": ": int = 1"}, {"name": "num_inference_steps", "val": ": int = 100"}, {"name": "generator", "val": ": typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None"}, {"name": "audio_length_in_s", "val": ": typing.Optional[float] = None"}, {"name": "return_dict", "val": ": bool = True"}]</parameters><paramsdesc>- **batch_size** (`int`, *optional*, defaults to 1) --
The number of audio samples to generate.
- **num_inference_steps** (`int`, *optional*, defaults to 100) --
The number of denoising steps. More denoising steps usually lead to a higher-quality audio sample at
the expense of slower inference.
- **generator** (`torch.Generator`, *optional*) --
A [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make
generation deterministic.
- **audio_length_in_s** (`float`, *optional*, defaults to `self.unet.config.sample_size/self.unet.config.sample_rate`) --
The length of the generated audio sample in seconds.
- **return_dict** (`bool`, *optional*, defaults to `True`) --
Whether or not to return an [AudioPipelineOutput](/docs/diffusers/pr_12229/en/api/pipelines/dance_diffusion#diffusers.AudioPipelineOutput) instead of a plain tuple.</paramsdesc><paramgroups>0</paramgroups><rettype>[AudioPipelineOutput](/docs/diffusers/pr_12229/en/api/pipelines/dance_diffusion#diffusers.AudioPipelineOutput) or `tuple`</rettype><retdesc>If `return_dict` is `True`, [AudioPipelineOutput](/docs/diffusers/pr_12229/en/api/pipelines/dance_diffusion#diffusers.AudioPipelineOutput) is returned, otherwise a `tuple` is
returned where the first element is a list with the generated audio.</retdesc></docstring>
The call function to the pipeline for generation.
<ExampleCodeBlock anchor="diffusers.DanceDiffusionPipeline.__call__.example">
Example:
```py
from diffusers import DiffusionPipeline
from scipy.io.wavfile import write

model_id = "harmonai/maestro-150k"
pipe = DiffusionPipeline.from_pretrained(model_id)
pipe = pipe.to("cuda")

audios = pipe(audio_length_in_s=4.0).audios

# To save locally
for i, audio in enumerate(audios):
    write(f"maestro_test_{i}.wav", pipe.unet.config.sample_rate, audio.transpose())

# To display in Google Colab
import IPython.display as ipd

for audio in audios:
    display(ipd.Audio(audio, rate=pipe.unet.config.sample_rate))
```
</ExampleCodeBlock>
</div></div>
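Beyond the example above, a hedged sketch of the remaining `__call__` parameters: seeded, batched generation and unpacking the plain tuple returned when `return_dict=False`:

```py
import torch
from diffusers import DanceDiffusionPipeline

pipe = DanceDiffusionPipeline.from_pretrained("harmonai/maestro-150k").to("cuda")

# A seeded generator makes the sampled noise (and thus the audio) reproducible.
generator = torch.Generator(device="cuda").manual_seed(0)

# With return_dict=False, the pipeline returns a plain tuple whose first
# element is the generated audio, so it can be unpacked directly.
(audios,) = pipe(
    batch_size=2,
    num_inference_steps=100,
    generator=generator,
    audio_length_in_s=4.0,
    return_dict=False,
)
print(audios.shape)  # two generated samples in the batch
```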
## AudioPipelineOutput[[diffusers.AudioPipelineOutput]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class diffusers.AudioPipelineOutput</name><anchor>diffusers.AudioPipelineOutput</anchor><source>https://github.com/huggingface/diffusers/blob/vr_12229/src/diffusers/pipelines/pipeline_utils.py#L132</source><parameters>[{"name": "audios", "val": ": ndarray"}]</parameters><paramsdesc>- **audios** (`np.ndarray`) --
List of denoised audio samples as a NumPy array of shape `(batch_size, num_channels, sample_length)`.</paramsdesc><paramgroups>0</paramgroups></docstring>
Output class for audio pipelines.
</div>
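A brief sketch of how this output class is typically consumed (the channel count depends on the checkpoint; stereo is assumed here):

```py
from diffusers import DanceDiffusionPipeline

pipe = DanceDiffusionPipeline.from_pretrained("harmonai/maestro-150k").to("cuda")
output = pipe(audio_length_in_s=4.0)

# `output.audios` holds the generated samples as a NumPy array with
# channels first, e.g. (1, 2, num_samples) for a stereo checkpoint.
print(type(output).__name__)  # AudioPipelineOutput
print(output.audios.shape)
```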