---
license: other
license_name: ideogram-non-commercial-model-agreement
license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8/blob/main/LICENSE.md
tags:
- comfyui
- ai-toolkit
- lora
---
<img src='resources/MERIDIQ_before_after_50_50.png' width='512'>
This is an early experimental LoRA that adds bbox-guided inpainting / editing to the Ideogram 4
model. It is a work in progress, so the files here are snapshots at different
points in time while I adjust training parameters and build a better dataset.
I currently get the most stable results with the [checkpoint at step 4000](https://huggingface.co/BitPoet/Ideogram4-Inpaint-LoRA/blob/main/IdoInpaint_2_00004000.safetensors) of the second training run.
The dataset is very small, so do not expect any magic or precision. It is a starting point that
will hopefully evolve over the next few weeks as I prepare a bigger dataset and restart training
with a larger rank and fine-tuned parameters.
## Prerequisites
### Custom Node
You can find my custom node set on GitHub at [ComfyUI-bitpoet-IG4Inpaint](https://github.com/BitPoet/ComfyUI-bitpoet-IG4Inpaint).
The necessary workflow is included in the node or can be downloaded [here](https://github.com/BitPoet/ComfyUI-bitpoet-IG4Inpaint/blob/main/workflows/ideogram4_reference_workflow.json).
### ComfyUI Changes
Check out or download the [dev-ideogram4-inpaint branch](https://github.com/BitPoet/ComfyUI/tree/dev-ideogram4-inpaint) of my ComfyUI fork.
## Training
To train with reference images, you currently need to use a slightly adapted fork of AI-Toolkit.
You can find my bitpoet-ideogram4-refimages branch [here on GitHub](https://github.com/BitPoet/ai-toolkit/tree/bitpoet-ideogram4-refimages).
It also includes a fix for the UTF-8 / ANSI encoding error that has recently been appearing on Windows and causes jobs to fail at startup.
Note that this AI-Toolkit adaptation has a switch for reference image support at the top of the dataset editor.
You have to switch this on every time you open a dataset with reference images.
An example training config for AI-Toolkit is also [in this repository](https://huggingface.co/BitPoet/Ideogram4-Inpaint-LoRA/blob/main/ai-toolkit_example_job_config.json).
I will add a small example dataset at some point.
If you want to assemble your own dataset, you might find my simple [node.js based dataset editor IdeoInCap](https://github.com/BitPoet/IdeoInCap) handy. (That's short
for Ideogram4 Inpaint Captioning. I know, not my most creative moment.) It is tailored to Ideogram 4 image-reference-prompt datasets and comes with a graphical bbox
editor and a completion indicator.
### Buzzwords (technical details)
What we changed in AI-Toolkit besides the dataset editor:
We added reference-latent token concatenation for Ideogram 4: each clean reference image is VAE-encoded and appended to the packed sequence as
`[text | noisy target | clean reference]`, with its own indicator, MRoPE time coordinate, and clean timestep. The transformer output and
diffusion loss are sliced to target tokens only, while bounding-box JSON prompts provide spatial edit conditioning.
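
The packing and slicing described above can be illustrated with a toy sketch. This is not the code from the AI-Toolkit fork: plain Python lists stand in for token tensors, the function names and indicator values are made up for illustration, and treating timestep `0.0` as "clean" is an assumption. The MRoPE time coordinate is left out to keep it short.

```python
# Illustrative sketch only. Plain lists of floats stand in for token tensors.
# All names (pack_sequence, target_only_loss, indicator values) are
# hypothetical and not taken from the AI-Toolkit fork.

TEXT, TARGET, REFERENCE = 0, 1, 2  # assumed per-token indicator values


def pack_sequence(text_tokens, noisy_target_tokens, clean_reference_tokens, t):
    """Build [text | noisy target | clean reference] plus per-token metadata."""
    tokens = text_tokens + noisy_target_tokens + clean_reference_tokens
    indicators = (
        [TEXT] * len(text_tokens)
        + [TARGET] * len(noisy_target_tokens)
        + [REFERENCE] * len(clean_reference_tokens)
    )
    # Target tokens carry the sampled timestep t. Reference tokens are clean,
    # so they get 0.0 here (an assumed convention for "clean").
    timesteps = (
        [None] * len(text_tokens)
        + [t] * len(noisy_target_tokens)
        + [0.0] * len(clean_reference_tokens)
    )
    start = len(text_tokens)
    target_slice = slice(start, start + len(noisy_target_tokens))
    return tokens, indicators, timesteps, target_slice


def target_only_loss(prediction, ground_truth, target_slice):
    """Mean squared error computed over the target positions only."""
    pred = prediction[target_slice]
    assert len(pred) == len(ground_truth)
    return sum((p - g) ** 2 for p, g in zip(pred, ground_truth)) / len(pred)


# Usage: 2 text tokens, 3 target tokens, 3 reference tokens.
tokens, indicators, timesteps, target_slice = pack_sequence(
    [0.1, 0.2], [1.0, 2.0, 3.0], [9.0, 9.0, 9.0], t=0.5
)
# Pretend model output for the full sequence; only positions 2..4 count.
prediction = [0.0, 0.0, 1.0, 2.0, 5.0, 7.0, 7.0, 7.0]
loss = target_only_loss(prediction, [1.0, 2.0, 3.0], target_slice)
```

Whatever the model predicts at text and reference positions never enters the loss, so the reference acts purely as context for the target tokens.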
These changes have to be mirrored in ComfyUI as well:
ComfyUI core: Extended the native Ideogram 4 model to accept reference latents and reproduce the training sequence `[text | noisy output | clean reference]`,
including the separate indicator, MRoPE coordinate, clean timestep, and output-only prediction slicing.
Custom node: `Ideogram4ReferenceConditioning` resizes and VAE-encodes a reference image to match the target latent, then attaches it only to positive
conditioning so the separate unconditional model remains unchanged.
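
As a rough illustration of the "positive only" idea, here is a minimal sketch. It relies on ComfyUI passing conditioning around as a list of `[tensor, options_dict]` pairs. The function name and the `reference_latents` key are assumptions made for this example, not taken from the custom node, and strings stand in for tensors.

```python
# Illustrative sketch only. attach_reference and the "reference_latents" key
# are hypothetical; the real node may store the latent differently.

def attach_reference(positive, negative, reference_latent, key="reference_latents"):
    """Return (positive, negative) with the reference added to positive only."""
    patched = []
    for cond, options in positive:
        new_options = dict(options)  # shallow copy so the input is not mutated
        new_options[key] = list(new_options.get(key, [])) + [reference_latent]
        patched.append([cond, new_options])
    # The negative / unconditional side is passed through untouched.
    return patched, negative


# Usage with placeholder values instead of real tensors.
positive = [["positive_cond", {"pooled_output": None}]]
negative = [["negative_cond", {}]]
new_positive, new_negative = attach_reference(positive, negative, "encoded_reference")
```

Copying the options dict keeps the original conditioning reusable elsewhere in the graph, while the negative branch never sees the reference.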
## Credits
Credits go to:
- [ideogram-ai](https://huggingface.co/ideogram-ai) for releasing a highly interesting, high-quality new image model.
- Ostris for [AI-Toolkit](https://github.com/ostris/ai-toolkit)
- [Comfy-Org](https://github.com/Comfy-Org) and [Kijai](https://huggingface.co/Kijai) for [ComfyUI](https://github.com/comfy-org/ComfyUI) itself and day-zero support for Ideogram 4
## Disclaimer
I am in no way affiliated with Ideogram, Inc.
The LoRAs provided here are my own experimental work.
Please see the license linked above.