BitPoet/Ideogram4-Inpaint-LoRA

BitPoet committed 16 days ago

Commit 8717741 · verified · 1 Parent(s): 9d0ddd6

Update README.md

Add technical details.

Files changed (1)

README.md +19 -2

@@ -30,6 +30,23 @@ You can find my bitpoet-ideogram4-refimages branch [here on GitHub](https://gith
 It also includes a fix for the UTF-8 / ANSI error lately popping up on Windows that has jobs fail at startup.
-Note that this AI-Toolkit adaption is targeted at Ideogram 4 with reference images and JSON prompts in the dataset editor, so you may not be able to use it to train regular LoRAs.
-I will add a small example dataset at some point.
+Note that this AI-Toolkit adaption is targeted at Ideogram 4 with reference images and JSON prompts in the dataset editor, so you may not be able to use it to
+train regular LoRAs.
+I will add a small example dataset at some point.
+### Buzzwords (technical details)
+What we changed in AI-Toolkit besides the dataset editor:
+We added reference-latent token concatenation for Ideogram 4: each clean reference image is VAE-encoded and appended to the packed sequence as
+[text | noisy target | clean reference], with its own indicator, MRoPE time coordinate, and clean timestep. The transformer output and
+diffusion loss are sliced to target tokens only, while bounding-box JSON prompts provide spatial edit conditioning.
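Illustration (not part of the commit): the packing and target-only loss described above could look roughly like the following pure-Python sketch. The names `pack`, `PackedSequence`, and `target_only_mse`, the indicator values 0/1/2, the time coordinate of 1 for the reference, a clean timestep of 0.0, and the choice to give text tokens the target's timestep are all assumptions made for illustration; the actual AI-Toolkit branch may differ on each of them.

```python
from dataclasses import dataclass

# Hypothetical segment ids; the real indicator values are not stated in the README.
TEXT, TARGET, REFERENCE = 0, 1, 2


@dataclass
class PackedSequence:
    tokens: list      # [text | noisy target | clean reference] token vectors
    indicator: list   # per-token segment id
    time_coord: list  # per-token time coordinate (assumed: 0 for text/target, 1 for reference)
    timestep: list    # per-token diffusion timestep
    target: slice     # positions of the noisy target tokens


def pack(text, noisy_target, clean_ref, t, clean_t=0.0):
    """Concatenate the three segments and build the per-token side information."""
    n_text, n_tgt, n_ref = len(text), len(noisy_target), len(clean_ref)
    return PackedSequence(
        tokens=list(text) + list(noisy_target) + list(clean_ref),
        indicator=[TEXT] * n_text + [TARGET] * n_tgt + [REFERENCE] * n_ref,
        time_coord=[0] * (n_text + n_tgt) + [1] * n_ref,
        # Assumption: text tokens share the target's timestep; the reference stays clean.
        timestep=[t] * (n_text + n_tgt) + [clean_t] * n_ref,
        target=slice(n_text, n_text + n_tgt),
    )


def target_only_mse(model_output, packed, expected):
    """Mean squared error over the target tokens only; text and reference outputs are ignored."""
    pred = model_output[packed.target]
    squared = [
        (p - e) ** 2
        for pred_tok, exp_tok in zip(pred, expected)
        for p, e in zip(pred_tok, exp_tok)
    ]
    return sum(squared) / len(squared)
```

Because the loss only reads `model_output[packed.target]`, whatever the transformer emits at the text and reference positions has no effect on the gradient signal in this sketch.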
+These changes have to be mirrored in ComfyUI as well:
+ComfyUI core: Extended the native Ideogram 4 model to accept reference latents and reproduce the training sequence [text | noisy output | clean reference],
+including the separate indicator, MRoPE coordinate, clean timestep, and output-only prediction slicing.
+Custom node: Ideogram4ReferenceConditioning resizes and VAE-encodes a reference image to match the target latent, then attaches it only to positive
+conditioning so the separate unconditional model remains unchanged.
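Illustration (not part of the commit): a minimal pure-Python sketch of what a node like Ideogram4ReferenceConditioning does conceptually. It assumes conditioning is a list of `[embedding, options]` pairs, that the reference is stored under a `reference_latents` key, and that the VAE downscales by a factor of 8; `attach_reference`, `resize_nearest`, and the `vae_encode` callable are hypothetical stand-ins, not the node's real code.

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2D grid given as a list of rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def attach_reference(positive, negative, reference_image, target_latent_hw,
                     vae_encode, downscale=8):
    """Encode the reference at the target's size and attach it to positive conditioning only."""
    latent_h, latent_w = target_latent_hw
    # Resize in pixel space so the encoded reference matches the target latent grid.
    resized = resize_nearest(reference_image, latent_h * downscale, latent_w * downscale)
    ref_latent = vae_encode(resized)

    new_positive = []
    for embedding, options in positive:
        options = dict(options)  # copy so the caller's conditioning is not mutated
        options["reference_latents"] = options.get("reference_latents", []) + [ref_latent]
        new_positive.append([embedding, options])

    # Negative conditioning is returned untouched, so the unconditional pass sees no reference.
    return new_positive, negative
```

Returning the negative conditioning unchanged is what keeps the unconditional branch identical to a run without any reference image.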