	# Building Custom Blocks

	[ModularPipelineBlocks](./pipeline_block) are the fundamental building blocks of a [ModularPipeline](/docs/diffusers/pr_12652/en/api/modular_diffusers/pipeline#diffusers.ModularPipeline). You can create custom blocks by defining their inputs, outputs, and computation logic. This guide demonstrates how to create and use a custom block.

	> [!TIP]
	> Explore the [Modular Diffusers Custom Blocks](https://huggingface.co/collections/diffusers/modular-diffusers-custom-blocks) collection for official custom blocks.

	## Project Structure

	Your custom block project should use the following structure:

	```shell
	.
	├── block.py
	└── modular_config.json
	```

	- `block.py` contains the custom block implementation
	- `modular_config.json` contains the metadata needed to load the block
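
A quick sanity check of this layout can be sketched with the standard library. The helper name `missing_block_files` is hypothetical and not part of Diffusers; it only confirms that the two files listed above exist in a folder.

```python
from pathlib import Path

# File names taken from the project structure above.
REQUIRED_FILES = ("block.py", "modular_config.json")


def missing_block_files(project_dir):
    """Return the required file names that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means both files are present; it says nothing about whether their contents are valid.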

	## Quick Start with Template

	The fastest way to create a custom block is to start from our template. The template provides a pre-configured project structure with `block.py` and `modular_config.json` files, plus commented examples showing how to define components, inputs, outputs, and the `__call__` method—so you can focus on your custom logic instead of boilerplate setup.

	### Download the template

	```python
	from diffusers import ModularPipelineBlocks

	model_id = "diffusers/custom-block-template"
	local_dir = model_id.split("/")[-1]

	blocks = ModularPipelineBlocks.from_pretrained(
	    model_id,
	    trust_remote_code=True,
	    local_dir=local_dir,
	)
	```

	This saves the template files to `custom-block-template/` in the current working directory. Pass a different path to `local_dir` to save them somewhere else.

	### Edit locally

	Open `block.py` and implement your custom block. The template includes commented examples showing how to define each property. See the [Florence-2 example](#example-florence-2-image-annotator) below for a complete implementation.

	### Test your block

	```python
	from diffusers import ModularPipelineBlocks

	# Folder created by the download step above
	local_dir = "custom-block-template"
	blocks = ModularPipelineBlocks.from_pretrained(local_dir, trust_remote_code=True)
	pipeline = blocks.init_pipeline()
	output = pipeline(...) # your inputs here
	```

	### Upload to the Hub

	```python
	pipeline.save_pretrained(local_dir, repo_id="your-username/your-block-name", push_to_hub=True)
	```

	## Example: Florence-2 Image Annotator

	This example creates a custom block with [Florence-2](https://huggingface.co/docs/transformers/model_doc/florence2) to process an input image and generate a mask for inpainting.

	### Define components

	Define the components the block needs, `Florence2ForConditionalGeneration` and its processor. When defining components, specify the `name` (how you'll access it in code), `type_hint` (the model class), and `pretrained_model_name_or_path` (where to load weights from).

	```python
	# Inside block.py
	from diffusers.modular_pipelines import ModularPipelineBlocks, ComponentSpec
	from transformers import AutoProcessor, Florence2ForConditionalGeneration

	class Florence2ImageAnnotatorBlock(ModularPipelineBlocks):

	    @property
	    def expected_components(self):
	        return [
	            ComponentSpec(
	                name="image_annotator",
	                type_hint=Florence2ForConditionalGeneration,
	                pretrained_model_name_or_path="florence-community/Florence-2-base-ft",
	            ),
	            ComponentSpec(
	                name="image_annotator_processor",
	                type_hint=AutoProcessor,
	                pretrained_model_name_or_path="florence-community/Florence-2-base-ft",
	            ),
	        ]
	```

	### Define inputs and outputs

	Inputs include the image, annotation task, and prompt. Outputs include the generated mask and annotations.

	```python
	from typing import List, Union
	from PIL import Image
	from diffusers.modular_pipelines import InputParam, OutputParam

	class Florence2ImageAnnotatorBlock(ModularPipelineBlocks):

	    # ... expected_components from above ...

	    @property
	    def inputs(self) -> List[InputParam]:
	        return [
	            InputParam(
	                "image",
	                type_hint=Union[Image.Image, List[Image.Image]],
	                required=True,
	                description="Image(s) to annotate",
	            ),
	            InputParam(
	                "annotation_task",
	                type_hint=str,
	                default="",
	                description="Annotation task to perform (e.g., <OD>, <CAPTION>, <REFERRING_EXPRESSION_SEGMENTATION>)",
	            ),
	            InputParam(
	                "annotation_prompt",
	                type_hint=str,
	                required=True,
	                description="Prompt to provide context for the annotation task",
	            ),
	            InputParam(
	                "annotation_output_type",
	                type_hint=str,
	                default="mask_image",
	                description="Output type: 'mask_image', 'mask_overlay', or 'bounding_box'",
	            ),
	        ]

	    @property
	    def intermediate_outputs(self) -> List[OutputParam]:
	        return [
	            OutputParam(
	                "mask_image",
	                type_hint=Image.Image,
	                description="Inpainting mask for the input image",
	            ),
	            OutputParam(
	                "annotations",
	                type_hint=dict,
	                description="Raw annotation predictions",
	            ),
	            OutputParam(
	                "image",
	                type_hint=Image.Image,
	                description="Annotated image",
	            ),
	        ]
	```
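
To make the roles of `required` and `default` concrete, here is a minimal sketch of how declared inputs could be resolved against what a caller passes. `ParamSpec` and `resolve_inputs` are hypothetical stand-ins written for this illustration; they are not the Diffusers `InputParam` class or its actual resolution logic.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ParamSpec:
    """Hypothetical stand-in for InputParam; not the Diffusers class."""
    name: str
    default: Any = None
    required: bool = False


def resolve_inputs(specs: List[ParamSpec], user_inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in defaults and reject calls that omit a required input."""
    resolved = {}
    for spec in specs:
        if spec.name in user_inputs:
            resolved[spec.name] = user_inputs[spec.name]
        elif spec.required:
            raise ValueError(f"Missing required input: {spec.name}")
        else:
            resolved[spec.name] = spec.default
    return resolved


specs = [
    ParamSpec("image", required=True),
    ParamSpec("annotation_prompt", required=True),
    ParamSpec("annotation_output_type", default="mask_image"),
]
resolved = resolve_inputs(specs, {"image": "car.jpg", "annotation_prompt": "the car"})
print(resolved["annotation_output_type"])  # mask_image
```

Omitting `annotation_prompt` in this sketch raises a `ValueError`, while omitting `annotation_output_type` falls back to its default.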

	### Implement the `__call__` method

	The `__call__` method contains the block's logic. Access inputs via `block_state`, run your computation, and set outputs back to `block_state`.

	```python
	import torch
	from diffusers.modular_pipelines import PipelineState

	class Florence2ImageAnnotatorBlock(ModularPipelineBlocks):

	    # ... expected_components, inputs, intermediate_outputs from above ...

	    @torch.no_grad()
	    def __call__(self, components, state: PipelineState) -> PipelineState:
	        block_state = self.get_block_state(state)

	        images, annotation_task_prompt = self.prepare_inputs(
	            block_state.image, block_state.annotation_prompt
	        )
	        task = block_state.annotation_task
	        fill = block_state.fill

	        annotations = self.get_annotations(
	            components, images, annotation_task_prompt, task
	        )
	        block_state.annotations = annotations
	        if block_state.annotation_output_type == "mask_image":
	            block_state.mask_image = self.prepare_mask(images, annotations)
	        else:
	            block_state.mask_image = None

	        if block_state.annotation_output_type == "mask_overlay":
	            block_state.image = self.prepare_mask(images, annotations, overlay=True, fill=fill)

	        elif block_state.annotation_output_type == "bounding_box":
	            block_state.image = self.prepare_bounding_boxes(images, annotations)

	        self.set_block_state(state, block_state)

	        return components, state

	    # Helper methods for mask/bounding box generation...
	```

	> [!TIP]
	> See the complete implementation at [diffusers/Florence2-image-Annotator](https://huggingface.co/diffusers/Florence2-image-Annotator).
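
The read, compute, and write-back steps in `__call__` can be tried without any model by using toy stand-ins. In this sketch, `ToyState` and `ToyUppercaseBlock` are hypothetical classes invented for illustration. They imitate the shape of the pattern only and do not reproduce how Diffusers implements `PipelineState`, `get_block_state`, or `set_block_state`.

```python
from types import SimpleNamespace


class ToyState:
    """Hypothetical stand-in for PipelineState: a plain dict of values."""

    def __init__(self, **values):
        self.values = dict(values)


class ToyUppercaseBlock:
    """Toy block that mirrors the read, compute, write-back steps."""

    input_names = ("annotation_prompt",)
    output_names = ("annotations",)

    def get_block_state(self, state):
        # Copy only the declared inputs into a local namespace.
        return SimpleNamespace(**{n: state.values.get(n) for n in self.input_names})

    def set_block_state(self, state, block_state):
        # Write only the declared outputs back to the shared state.
        for name in self.output_names:
            state.values[name] = getattr(block_state, name)

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        block_state.annotations = {"label": block_state.annotation_prompt.upper()}
        self.set_block_state(state, block_state)
        return components, state


components, state = ToyUppercaseBlock()(None, ToyState(annotation_prompt="the car"))
print(state.values["annotations"])  # {'label': 'THE CAR'}
```

As in the Florence-2 block, the method returns both `components` and `state` so the next block can continue from the updated state.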

	## Using Custom Blocks

	Load a custom block with [from_pretrained()](/docs/diffusers/pr_12652/en/api/modular_diffusers/pipeline#diffusers.ModularPipeline.from_pretrained) and set `trust_remote_code=True`.

	```py
	import torch
	from diffusers import ModularPipeline
	from diffusers.utils import load_image

	# Load the Florence-2 annotator pipeline
	image_annotator = ModularPipeline.from_pretrained(
	    "diffusers/Florence2-image-Annotator",
	    trust_remote_code=True,
	)

	# Check the docstring to see inputs/outputs
	print(image_annotator.blocks.doc)
	```

	Use the block to generate a mask:

	```python
	image_annotator.load_components(torch_dtype=torch.bfloat16)
	image_annotator.to("cuda")

	image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg")
	image = image.resize((1024, 1024))
	prompt = ["A red car"]
	annotation_task = "<REFERRING_EXPRESSION_SEGMENTATION>"
	annotation_prompt = ["the car"]

	mask_image = image_annotator(
	    prompt=prompt,
	    image=image,
	    annotation_task=annotation_task,
	    annotation_prompt=annotation_prompt,
	    annotation_output_type="mask_image",
	).images
	mask_image[0].save("car-mask.png")
	```

	Compose it with other blocks to create a new pipeline:

	```python
	# Get the annotator block
	annotator_block = image_annotator.blocks

	# Get an inpainting workflow and insert the annotator at the beginning
	inpaint_blocks = ModularPipeline.from_pretrained("Qwen/Qwen-Image").blocks.get_workflow("inpainting")
	inpaint_blocks.sub_blocks.insert("image_annotator", annotator_block, 0)

	# Initialize the combined pipeline
	pipe = inpaint_blocks.init_pipeline()
	pipe.load_components(torch_dtype=torch.float16, device="cuda")

	# Now the pipeline automatically generates masks from prompts
	output = pipe(
	    prompt=prompt,
	    image=image,
	    annotation_task=annotation_task,
	    annotation_prompt=annotation_prompt,
	    annotation_output_type="mask_image",
	    num_inference_steps=35,
	    guidance_scale=7.5,
	    strength=0.95,
	    output="images",
	)
	output[0].save("florence-inpainting.png")
	```
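
The positional insert used above (`insert(name, block, index)`) is easiest to reason about as an operation on an ordered mapping. The following sketch shows the idea with a plain `OrderedDict`. The helper `insert_at` and the block names in `workflow` are hypothetical and are not taken from the Qwen-Image workflow or from the Diffusers implementation of `sub_blocks`.

```python
from collections import OrderedDict


def insert_at(blocks, name, block, index):
    """Return a new OrderedDict with (name, block) placed at position index."""
    items = [(k, v) for k, v in blocks.items() if k != name]
    items.insert(index, (name, block))
    return OrderedDict(items)


# Hypothetical block names, used only to show the resulting order.
workflow = OrderedDict([("text_encoder", "encode"), ("denoise", "denoise"), ("decode", "decode")])
workflow = insert_at(workflow, "image_annotator", "annotate", 0)
print(list(workflow))  # ['image_annotator', 'text_encoder', 'denoise', 'decode']
```

Placing the annotator at index 0 means its outputs, such as `mask_image`, are already in the pipeline state when the later blocks run.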

	## Editing custom blocks

	To edit a custom block, download it locally. This is the same workflow as the [Quick Start with Template](#quick-start-with-template), but it starts from an existing block instead of the template.

	Use the `local_dir` argument to download a custom block to a specific folder:

	```python
	from diffusers import ModularPipelineBlocks

	# Download to a local folder for editing
	annotator_block = ModularPipelineBlocks.from_pretrained(
	    "diffusers/Florence2-image-Annotator",
	    trust_remote_code=True,
	    local_dir="./my-florence-block",
	)
	```

	Any changes made to the block files in this folder will be reflected when you load the block again. When you're ready to share your changes, upload to a new repository:

	```python
	pipeline = annotator_block.init_pipeline()
	pipeline.save_pretrained("./my-florence-block", repo_id="your-username/my-custom-florence", push_to_hub=True)
	```

	## Next Steps

	This guide covered creating a single custom block. Learn how to compose multiple blocks together:

	- [SequentialPipelineBlocks](./sequential_pipeline_blocks): Chain blocks to execute in sequence
	- [ConditionalPipelineBlocks](./auto_pipeline_blocks): Create conditional blocks that select different execution paths
	- [LoopSequentialPipelineBlocks](./loop_sequential_pipeline_blocks): Define iterative workflows like the denoising loop

	Make your custom block work with Mellon's visual interface. See the [Mellon Custom Blocks](./mellon) guide.

	Browse the [Modular Diffusers Custom Blocks](https://huggingface.co/collections/diffusers/modular-diffusers-custom-blocks) collection for inspiration and ready-to-use blocks.

	## Dependencies

	Declaring package dependencies in custom blocks prevents runtime import errors later on. Diffusers validates the dependencies and returns a warning if a package is missing or incompatible.

	Set a `_requirements` attribute in your block class, mapping package names to version specifiers.

	```py
	from diffusers.modular_pipelines import ModularPipelineBlocks

	class MyCustomBlock(ModularPipelineBlocks):
	    _requirements = {
	        "transformers": ">=4.44.0",
	        "sentencepiece": ">=0.2.0",
	    }
	```

	When there are blocks with different requirements, Diffusers merges their requirements.

	```py
	from diffusers.modular_pipelines import ModularPipelineBlocks, SequentialPipelineBlocks

	class BlockA(ModularPipelineBlocks):
	    _requirements = {"transformers": ">=4.44.0"}
	    # ...

	class BlockB(ModularPipelineBlocks):
	    _requirements = {"sentencepiece": ">=0.2.0"}
	    # ...

	pipe = SequentialPipelineBlocks.from_blocks_dict({
	    "block_a": BlockA,
	    "block_b": BlockB,
	})
	```
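
One way to picture the merge is as a union of the per-block dictionaries. The sketch below is an assumption about what such a merge could look like, not the Diffusers implementation. In particular, joining conflicting specifiers for the same package with a comma is a choice made for this example, and `merge_requirements` is a hypothetical helper.

```python
def merge_requirements(*requirement_dicts):
    """Union several {package: specifier} dicts, AND-ing conflicting specifiers."""
    merged = {}
    for requirements in requirement_dicts:
        for package, specifier in requirements.items():
            if package not in merged:
                merged[package] = specifier
            elif specifier not in merged[package].split(","):
                # Assumption: keep both constraints as a comma-separated set.
                merged[package] = f"{merged[package]},{specifier}"
    return merged


block_a = {"transformers": ">=4.44.0"}
block_b = {"sentencepiece": ">=0.2.0", "transformers": "<5.0.0"}
print(merge_requirements(block_a, block_b))
# {'transformers': '>=4.44.0,<5.0.0', 'sentencepiece': '>=0.2.0'}
```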

	When this block is saved with [save_pretrained()](/docs/diffusers/pr_12652/en/api/modular_diffusers/pipeline#diffusers.ModularPipeline.save_pretrained), the requirements are saved to the `modular_config.json` file. When this block is loaded, Diffusers checks each requirement against the current environment. If there is a mismatch or a package isn't found, Diffusers returns the following warning.

	```md
	# missing package
	xyz-package was specified in the requirements but wasn't found in the current environment.

	# version mismatch
	xyz requirement 'specific-version' is not satisfied by the installed version 'actual-version'. Things might work unexpected.
	```
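
The two cases above (missing package, version mismatch) can be approximated with the standard library. This is a simplified sketch and not the check Diffusers performs: it only understands `>=` specifiers with numeric versions, and `check_requirements`, `parse_version`, and `fake_version` are hypothetical names. The `get_version` parameter exists so the example can run against a fake environment.

```python
from importlib import metadata


def parse_version(text):
    """Turn '4.44.0' into (4, 44, 0); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_requirements(requirements, get_version=metadata.version):
    """Return warning strings for missing packages or unmet '>=' minimums."""
    warnings = []
    for package, specifier in requirements.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            warnings.append(f"{package} was specified in the requirements but wasn't found.")
            continue
        if specifier.startswith(">=") and parse_version(installed) < parse_version(specifier[2:]):
            warnings.append(f"{package} requirement '{specifier}' is not satisfied by '{installed}'.")
    return warnings


def fake_version(package):
    """Pretend environment: only an old transformers release is installed."""
    installed = {"transformers": "4.40.0"}
    if package not in installed:
        raise metadata.PackageNotFoundError(package)
    return installed[package]


requirements = {"transformers": ">=4.44.0", "sentencepiece": ">=0.2.0"}
for message in check_requirements(requirements, fake_version):
    print(message)
```

Called without the second argument, `check_requirements` looks up versions in the real environment through `importlib.metadata.version`.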
