Update README.md
README.md
CHANGED
@@ -53,12 +53,12 @@ It's tailored for especially for Ideogram 4 image-reference-prompt datasets, wit
 What we changed in AI-Toolkit besides the dataset editor:
 
 We added reference-latent token concatenation for Ideogram 4: each clean reference image is VAE-encoded and appended to the packed sequence as
-[text | noisy target | clean reference], with its own indicator, MRoPE time coordinate, and clean timestep. The transformer output and
+`[text | noisy target | clean reference]`, with its own indicator, MRoPE time coordinate, and clean timestep. The transformer output and
 diffusion loss are sliced to target tokens only, while bounding-box JSON prompts provide spatial edit conditioning.
 
 These changes have to be mirrored in ComfyUI as well:
 
-ComfyUI core: Extended the native Ideogram 4 model to accept reference latents and reproduce the training sequence [text | noisy output | clean reference],
+ComfyUI core: Extended the native Ideogram 4 model to accept reference latents and reproduce the training sequence `[text | noisy output | clean reference]`,
 including the separate indicator, MRoPE coordinate, clean timestep, and output-only prediction slicing.
 
 Custom node: Ideogram4ReferenceConditioning resizes and VAE-encodes a reference image to match the target latent, then attaches it only to positive
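
For readers of this hunk, the packing and slicing that the README text describes can be sketched roughly as follows. This is a toy illustration in plain Python, not the AI-Toolkit or ComfyUI code: the names `pack` and `target_only_mse`, the indicator ids, the choice of `0.0` as the clean timestep, and the per-reference time coordinate `k` are all assumptions made for the example, and tokens are stand-in floats rather than VAE latents.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical indicator ids; the README does not state the real values.
TEXT, TARGET, REFERENCE = 0, 1, 2
CLEAN_TIMESTEP = 0.0  # assumption: a "clean" token carries zero noise


@dataclass
class Packed:
    tokens: List[float]
    indicators: List[int]    # which segment each token belongs to
    time_coords: List[int]   # stand-in for the MRoPE time coordinate
    timesteps: List[float]   # per-token diffusion timestep
    target: slice            # where the noisy target sits in the sequence


def pack(text: Sequence[float], noisy_target: Sequence[float],
         references: Sequence[Sequence[float]], t: float) -> Packed:
    """Build [text | noisy target | clean reference ...] with side channels."""
    tokens = list(text) + list(noisy_target)
    indicators = [TEXT] * len(text) + [TARGET] * len(noisy_target)
    time_coords = [0] * len(tokens)
    timesteps = [t] * len(tokens)
    target = slice(len(text), len(tokens))
    # Each reference is appended after the target with its own indicator,
    # its own time coordinate, and the clean timestep.
    for k, ref in enumerate(references, start=1):
        tokens += list(ref)
        indicators += [REFERENCE] * len(ref)
        time_coords += [k] * len(ref)
        timesteps += [CLEAN_TIMESTEP] * len(ref)
    return Packed(tokens, indicators, time_coords, timesteps, target)


def target_only_mse(prediction: Sequence[float], wanted: Sequence[float],
                    packed: Packed) -> float:
    """Slice the model output to target tokens before computing the loss."""
    sliced = prediction[packed.target]
    assert len(sliced) == len(wanted)
    return sum((p - w) ** 2 for p, w in zip(sliced, wanted)) / len(wanted)
```

In this sketch the text and reference positions of `prediction` never reach the loss, which mirrors the "sliced to target tokens only" wording; the same slice would be applied to the prediction at inference time.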