# Building Custom Blocks

[ModularPipelineBlocks](./pipeline_block) are the fundamental building blocks of a [ModularPipeline](/docs/diffusers/pr_12652/en/api/modular_diffusers/pipeline#diffusers.ModularPipeline). You can create custom blocks by defining their inputs, outputs, and computation logic. This guide demonstrates how to create and use a custom block.

> [!TIP]
> Explore the [Modular Diffusers Custom Blocks](https://huggingface.co/collections/diffusers/modular-diffusers-custom-blocks) collection for official custom blocks.
## Project Structure

Your custom block project should use the following structure:

```shell
.
├── block.py
└── modular_config.json
```

- `block.py` contains the custom block implementation
- `modular_config.json` contains the metadata needed to load the block
## Quick Start with Template

The fastest way to create a custom block is to start from our template. The template provides a pre-configured project structure with `block.py` and `modular_config.json` files, plus commented examples showing how to define components, inputs, outputs, and the `__call__` method, so you can focus on your custom logic instead of boilerplate setup.

### Download the template

```python
from diffusers import ModularPipelineBlocks

model_id = "diffusers/custom-block-template"
local_dir = model_id.split("/")[-1]

blocks = ModularPipelineBlocks.from_pretrained(
    model_id,
    trust_remote_code=True,
    local_dir=local_dir
)
```
This saves the template files to `custom-block-template/`; pass a different path to `local_dir` to save them elsewhere.
### Edit locally

Open `block.py` and implement your custom block. The template includes commented examples showing how to define each property. See the [Florence-2 example](#example-florence-2-image-annotator) below for a complete implementation.

### Test your block

```python
from diffusers import ModularPipelineBlocks

blocks = ModularPipelineBlocks.from_pretrained(local_dir, trust_remote_code=True)
pipeline = blocks.init_pipeline()
output = pipeline(...)  # your inputs here
```

### Upload to the Hub

```python
pipeline.save_pretrained(local_dir, repo_id="your-username/your-block-name", push_to_hub=True)
```
## Example: Florence-2 Image Annotator

This example creates a custom block with [Florence-2](https://huggingface.co/docs/transformers/model_doc/florence2) to process an input image and generate a mask for inpainting.

### Define components

Define the components the block needs: `Florence2ForConditionalGeneration` and its processor. When defining components, specify the `name` (how you'll access it in code), `type_hint` (the model class), and `pretrained_model_name_or_path` (where to load weights from).
```python
# Inside block.py
from diffusers.modular_pipelines import ModularPipelineBlocks, ComponentSpec
from transformers import AutoProcessor, Florence2ForConditionalGeneration

class Florence2ImageAnnotatorBlock(ModularPipelineBlocks):
    @property
    def expected_components(self):
        return [
            ComponentSpec(
                name="image_annotator",
                type_hint=Florence2ForConditionalGeneration,
                pretrained_model_name_or_path="florence-community/Florence-2-base-ft",
            ),
            ComponentSpec(
                name="image_annotator_processor",
                type_hint=AutoProcessor,
                pretrained_model_name_or_path="florence-community/Florence-2-base-ft",
            ),
        ]
```
### Define inputs and outputs

Inputs include the image, annotation task, and prompt. Outputs include the generated mask and annotations.
```python
from typing import List, Union

from PIL import Image

from diffusers.modular_pipelines import InputParam, OutputParam

class Florence2ImageAnnotatorBlock(ModularPipelineBlocks):
    # ... expected_components from above ...

    @property
    def inputs(self) -> List[InputParam]:
        return [
            InputParam(
                "image",
                type_hint=Union[Image.Image, List[Image.Image]],
                required=True,
                description="Image(s) to annotate",
            ),
            InputParam(
                "annotation_task",
                type_hint=str,
                default="",
                description="Annotation task to perform (e.g., <OD>, <CAPTION>, <CAPTION_TO_PHRASE_GROUNDING>)",
            ),
            InputParam(
                "annotation_prompt",
                type_hint=str,
                required=True,
                description="Prompt to provide context for the annotation task",
            ),
            InputParam(
                "annotation_output_type",
                type_hint=str,
                default="mask_image",
                description="Output type: 'mask_image', 'mask_overlay', or 'bounding_box'",
            ),
        ]

    @property
    def intermediate_outputs(self) -> List[OutputParam]:
        return [
            OutputParam(
                "mask_image",
                type_hint=Image.Image,
                description="Inpainting mask for the input image",
            ),
            OutputParam(
                "annotations",
                type_hint=dict,
                description="Raw annotation predictions",
            ),
            OutputParam(
                "image",
                type_hint=Image.Image,
                description="Annotated image",
            ),
        ]
```
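The `required` and `default` fields determine how call-time arguments are resolved before the block runs. The resolution logic can be illustrated with a toy version; this is a simplified sketch of the idea, not diffusers' actual `InputParam` implementation:

```python
# Toy illustration of required/default input resolution.
# InputSpec and resolve_inputs are hypothetical names, not diffusers APIs.
from dataclasses import dataclass

@dataclass
class InputSpec:
    name: str
    required: bool = False
    default: object = None

def resolve_inputs(specs, kwargs):
    """Fill in defaults and raise if a required input is missing."""
    resolved = {}
    for spec in specs:
        if spec.name in kwargs:
            resolved[spec.name] = kwargs[spec.name]
        elif spec.required:
            raise ValueError(f"Missing required input: {spec.name}")
        else:
            resolved[spec.name] = spec.default
    return resolved

specs = [
    InputSpec("image", required=True),
    InputSpec("annotation_output_type", default="mask_image"),
]
print(resolve_inputs(specs, {"image": "img.png"}))
# {'image': 'img.png', 'annotation_output_type': 'mask_image'}
```

Calling the same function without `image` raises a `ValueError`, which mirrors how a missing required input surfaces as an error rather than silently defaulting.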
### Implement the `__call__` method

The `__call__` method contains the block's logic. Access inputs via `block_state`, run your computation, and set outputs back to `block_state`.
```python
import torch

from diffusers.modular_pipelines import PipelineState

class Florence2ImageAnnotatorBlock(ModularPipelineBlocks):
    # ... expected_components, inputs, intermediate_outputs from above ...

    @torch.no_grad()
    def __call__(self, components, state: PipelineState) -> PipelineState:
        block_state = self.get_block_state(state)

        images, annotation_task_prompt = self.prepare_inputs(
            block_state.image, block_state.annotation_prompt
        )
        task = block_state.annotation_task
        fill = block_state.fill  # "fill" is an additional input declared in the complete implementation

        annotations = self.get_annotations(
            components, images, annotation_task_prompt, task
        )
        block_state.annotations = annotations

        if block_state.annotation_output_type == "mask_image":
            block_state.mask_image = self.prepare_mask(images, annotations)
        else:
            block_state.mask_image = None

        if block_state.annotation_output_type == "mask_overlay":
            block_state.image = self.prepare_mask(images, annotations, overlay=True, fill=fill)
        elif block_state.annotation_output_type == "bounding_box":
            block_state.image = self.prepare_bounding_boxes(images, annotations)

        self.set_block_state(state, block_state)
        return components, state

    # Helper methods for mask/bounding box generation...
```
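The `get_block_state`/`set_block_state` pattern above can be illustrated with a toy version: declared values are copied out of the shared state into an attribute-style namespace, mutated, and written back. This is a simplified sketch of the idea (`ToyState` is a hypothetical class), not diffusers' actual `PipelineState`:

```python
from types import SimpleNamespace

class ToyState:
    """Minimal stand-in for a pipeline state: a dict of named values."""
    def __init__(self, **values):
        self.values = dict(values)

    def get_block_state(self, names):
        # Copy the declared inputs into an attribute-style namespace.
        return SimpleNamespace(**{n: self.values.get(n) for n in names})

    def set_block_state(self, block_state):
        # Write every attribute back, including newly created outputs.
        self.values.update(vars(block_state))

state = ToyState(image="car.jpg", annotation_prompt="the car")
block_state = state.get_block_state(["image", "annotation_prompt"])
block_state.mask_image = f"mask for {block_state.image}"  # a new output
state.set_block_state(block_state)
print(state.values["mask_image"])  # mask for car.jpg
```

Because outputs are written back into the shared state, any downstream block (such as an inpainting block expecting `mask_image`) can read them without the two blocks knowing about each other.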
> [!TIP]
> See the complete implementation at [diffusers/Florence2-image-Annotator](https://huggingface.co/diffusers/Florence2-image-Annotator).

## Using Custom Blocks

Load a custom block with [from_pretrained()](/docs/diffusers/pr_12652/en/api/modular_diffusers/pipeline#diffusers.ModularPipeline.from_pretrained) and set `trust_remote_code=True`.
```py
import torch
from diffusers import ModularPipeline
from diffusers.utils import load_image

# Load the Florence-2 annotator pipeline
image_annotator = ModularPipeline.from_pretrained(
    "diffusers/Florence2-image-Annotator",
    trust_remote_code=True
)

# Check the docstring to see inputs/outputs
print(image_annotator.blocks.doc)
```
Use the block to generate a mask:

```python
image_annotator.load_components(torch_dtype=torch.bfloat16)
image_annotator.to("cuda")

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg")
image = image.resize((1024, 1024))

prompt = ["A red car"]
annotation_task = ""  # a Florence-2 task token, e.g. "<CAPTION_TO_PHRASE_GROUNDING>"
annotation_prompt = ["the car"]

mask_image = image_annotator(
    prompt=prompt,
    image=image,
    annotation_task=annotation_task,
    annotation_prompt=annotation_prompt,
    annotation_output_type="mask_image",
).images
mask_image[0].save("car-mask.png")
```
Compose it with other blocks to create a new pipeline:

```python
# Get the annotator block
annotator_block = image_annotator.blocks

# Get an inpainting workflow and insert the annotator at the beginning
inpaint_blocks = ModularPipeline.from_pretrained("Qwen/Qwen-Image").blocks.get_workflow("inpainting")
inpaint_blocks.sub_blocks.insert("image_annotator", annotator_block, 0)

# Initialize the combined pipeline
pipe = inpaint_blocks.init_pipeline()
pipe.load_components(torch_dtype=torch.float16, device="cuda")

# Now the pipeline automatically generates masks from prompts
output = pipe(
    prompt=prompt,
    image=image,
    annotation_task=annotation_task,
    annotation_prompt=annotation_prompt,
    annotation_output_type="mask_image",
    num_inference_steps=35,
    guidance_scale=7.5,
    strength=0.95,
    output="images"
)
output[0].save("florence-inpainting.png")
```
## Editing custom blocks

Edit a custom block by downloading it locally. This is the same workflow as the [Quick Start with Template](#quick-start-with-template), but starting from an existing block instead of the template.

Use the `local_dir` argument to download a custom block to a specific folder:
```python
from diffusers import ModularPipelineBlocks

# Download to a local folder for editing
annotator_block = ModularPipelineBlocks.from_pretrained(
    "diffusers/Florence2-image-Annotator",
    trust_remote_code=True,
    local_dir="./my-florence-block"
)
```

Any changes made to the block files in this folder will be reflected when you load the block again. When you're ready to share your changes, upload to a new repository:

```python
pipeline = annotator_block.init_pipeline()
pipeline.save_pretrained("./my-florence-block", repo_id="your-username/my-custom-florence", push_to_hub=True)
```
## Next Steps

This guide covered creating a single custom block. Learn how to compose multiple blocks together:

- [SequentialPipelineBlocks](./sequential_pipeline_blocks): Chain blocks to execute in sequence
- [ConditionalPipelineBlocks](./auto_pipeline_blocks): Create conditional blocks that select different execution paths
- [LoopSequentialPipelineBlocks](./loop_sequential_pipeline_blocks): Define iterative workflows like the denoising loop

Make your custom block work with Mellon's visual interface. See the [Mellon Custom Blocks](./mellon) guide.

Browse the [Modular Diffusers Custom Blocks](https://huggingface.co/collections/diffusers/modular-diffusers-custom-blocks) collection for inspiration and ready-to-use blocks.
## Dependencies

Declaring package dependencies in custom blocks prevents runtime import errors later on. Diffusers validates the dependencies and returns a warning if a package is missing or incompatible.

Set a `_requirements` attribute in your block class, mapping package names to version specifiers.

```py
from diffusers.modular_pipelines import PipelineBlock

class MyCustomBlock(PipelineBlock):
    _requirements = {
        "transformers": ">=4.44.0",
        "sentencepiece": ">=0.2.0"
    }
```
When there are blocks with different requirements, Diffusers merges their requirements.

```py
from diffusers.modular_pipelines import PipelineBlock, SequentialPipelineBlocks

class BlockA(PipelineBlock):
    _requirements = {"transformers": ">=4.44.0"}
    # ...

class BlockB(PipelineBlock):
    _requirements = {"sentencepiece": ">=0.2.0"}
    # ...

pipe = SequentialPipelineBlocks.from_blocks_dict({
    "block_a": BlockA,
    "block_b": BlockB,
})
```
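Conceptually, the merge collects every block's `_requirements` into one mapping. The sketch below is a hypothetical illustration of that idea, not Diffusers' actual merge logic; it assumes differing specifiers for the same package are simply joined pip-style:

```python
def merge_requirements(blocks_requirements):
    """Merge per-block requirement dicts into one mapping.

    Hypothetical helper for illustration; joins differing specifiers
    for the same package with a comma (pip-style AND).
    """
    merged = {}
    for reqs in blocks_requirements:
        for package, specifier in reqs.items():
            if package in merged and merged[package] != specifier:
                merged[package] = merged[package] + "," + specifier
            else:
                merged[package] = specifier
    return merged

block_a = {"transformers": ">=4.44.0"}
block_b = {"sentencepiece": ">=0.2.0", "transformers": ">=4.40.0"}
print(merge_requirements([block_a, block_b]))
# {'transformers': '>=4.44.0,>=4.40.0', 'sentencepiece': '>=0.2.0'}
```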
When this block is saved with [save_pretrained()](/docs/diffusers/pr_12652/en/api/modular_diffusers/pipeline#diffusers.ModularPipeline.save_pretrained), the requirements are saved to the `modular_config.json` file. When this block is loaded, Diffusers checks each requirement against the current environment. If there is a mismatch or a package isn't found, Diffusers returns the following warning.

```md
# missing package
xyz-package was specified in the requirements but wasn't found in the current environment.

# version mismatch
xyz requirement 'specific-version' is not satisfied by the installed version 'actual-version'. Things might work unexpected.
```
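A check of this kind can be approximated with the standard library's `importlib.metadata`. The sketch below is an illustration of the idea, not Diffusers' validation code; it handles only simple `>=` specifiers with purely numeric version parts:

```python
import warnings
from importlib.metadata import PackageNotFoundError, version

def parse(v):
    """Parse a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split(".") if part.isdigit())

def check_requirements(requirements):
    """Warn about missing or too-old packages (hypothetical helper)."""
    for package, spec in requirements.items():
        if not spec.startswith(">="):
            continue  # only simple >= specifiers in this sketch
        try:
            installed = version(package)
        except PackageNotFoundError:
            warnings.warn(f"{package} was specified in the requirements "
                          "but wasn't found in the current environment.")
            continue
        if parse(installed) < parse(spec[2:]):
            warnings.warn(f"{package} requirement '{spec}' is not satisfied "
                          f"by the installed version '{installed}'.")

# A nonexistent package triggers the "missing package" warning.
check_requirements({"definitely-not-installed-pkg": ">=1.0.0"})
```

Real-world specifier handling (pre-releases, `<`, `!=`, combined ranges) is better served by the `packaging` library's `SpecifierSet`; the manual tuple comparison here is only to keep the sketch dependency-free.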