| --- |
| license: other |
| license_name: ideogram-non-commercial-model-agreement |
| license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8/blob/main/LICENSE.md |
| tags: |
| - comfyui |
| - ai-toolkit |
| - lora |
| --- |
| |
| <img src='resources/MERIDIQ_before_after_50_50.png' width='512'> |
|
|
| This is an early experimental LoRA that adds bbox-guided inpainting / editing to the Ideogram 4 |
| model. It is a work in progress, so the files here are snapshots taken at different |
| points in time while I adjust training parameters and build a better dataset. |
|
|
| I currently get the most stable results with the [checkpoint at step 4000](https://huggingface.co/BitPoet/Ideogram4-Inpaint-LoRA/blob/main/IdoInpaint_2_00004000.safetensors) of the second training run. |
|
|
| The dataset is very small, so do not expect any magic or precision. It is a starting point that |
| I hope will improve over the next few weeks as I prepare a bigger dataset and restart training |
| with a larger rank and fine-tuned parameters. |
|
|
| ## Prerequisites |
|
|
| ### Custom Node |
|
|
| You can find my custom node set on GitHub at [ComfyUI-bitpoet-IG4Inpaint](https://github.com/BitPoet/ComfyUI-bitpoet-IG4Inpaint). |
| The necessary workflow is included with the node set or can be downloaded [here](https://github.com/BitPoet/ComfyUI-bitpoet-IG4Inpaint/blob/main/workflows/ideogram4_reference_workflow.json). |
|
|
| ### ComfyUI Changes |
|
|
| Check out or download the [dev-ideogram4-inpaint branch](https://github.com/BitPoet/ComfyUI/tree/dev-ideogram4-inpaint) of my Comfy fork. |
|
|
| ## Training |
|
|
| To train with reference images, you currently need to use a slightly adapted fork of AI-Toolkit. |
| You can find my bitpoet-ideogram4-refimages branch [here on GitHub](https://github.com/BitPoet/ai-toolkit/tree/bitpoet-ideogram4-refimages). |
|
|
| It also includes a fix for the UTF-8 / ANSI encoding error that has recently been causing jobs to fail at startup on Windows. |
|
|
| Note that this AI-Toolkit adaptation has a switch for reference image support at the top of the dataset editor. |
| You have to turn it on every time you open a dataset with reference images. |
|
|
| An example training config for AI-Toolkit is also [in this repository](https://huggingface.co/BitPoet/Ideogram4-Inpaint-LoRA/blob/main/ai-toolkit_example_job_config.json). |
|
|
| I will add a small example dataset at some point. |
|
|
| If you want to assemble your own dataset, you might find my simple [Node.js-based dataset editor IdeoInCap](https://github.com/BitPoet/IdeoInCap) handy (that's short |
| for Ideogram4 Inpaint Captioning; I know, not my most creative moment). It's tailored specifically to Ideogram 4 image-reference-prompt datasets, with a graphical bbox |
| editor and completion indication. |
|
|
| ### Buzzwords (technical details) |
|
|
| What we changed in AI-Toolkit besides the dataset editor: |
|
|
| We added reference-latent token concatenation for Ideogram 4: each clean reference image is VAE-encoded and appended to the packed sequence as |
| `[text | noisy target | clean reference]`, with its own indicator, MRoPE time coordinate, and clean timestep. The transformer output and |
| diffusion loss are sliced to target tokens only, while bounding-box JSON prompts provide spatial edit conditioning. |
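
The packing and slicing described above can be illustrated with a small pure-Python sketch. It is not the code from the AI-Toolkit fork: the indicator ids, the time coordinate values, the use of `0.0` for the clean timestep, and the names `pack_layout` and `target_only_mse` are all assumptions made for illustration, and each token is reduced to a single scalar to keep the example short.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical indicator ids, chosen only for this sketch.
TEXT, TARGET, REFERENCE = 0, 1, 2


@dataclass
class PackedLayout:
    indicator: List[int]    # segment indicator per token
    time_coord: List[int]   # MRoPE time coordinate per token (illustrative values)
    timestep: List[float]   # diffusion timestep per token
    target: slice           # position of the noisy target tokens


def pack_layout(n_text: int, n_target: int, n_ref: int, t: float) -> PackedLayout:
    """Describe the packed sequence [text | noisy target | clean reference]."""
    indicator = [TEXT] * n_text + [TARGET] * n_target + [REFERENCE] * n_ref
    # Assumption: the reference sits on its own time coordinate (1), apart from the target (0).
    time_coord = [0] * (n_text + n_target) + [1] * n_ref
    # Assumption: text and target use the sampled timestep t; 0.0 stands in for "clean".
    timestep = [t] * (n_text + n_target) + [0.0] * n_ref
    return PackedLayout(indicator, time_coord, timestep, slice(n_text, n_text + n_target))


def target_only_mse(prediction: List[float], truth: List[float], layout: PackedLayout) -> float:
    """Mean squared error over the target tokens; text and reference tokens are ignored."""
    sliced = prediction[layout.target]
    if len(sliced) != len(truth):
        raise ValueError("truth must have one value per target token")
    return sum((p - y) ** 2 for p, y in zip(sliced, truth)) / len(truth)


layout = pack_layout(n_text=2, n_target=3, n_ref=3, t=0.7)
prediction = [9.0, 9.0, 1.0, 2.0, 3.0, 9.0, 9.0, 9.0]  # one scalar per token for brevity
loss = target_only_mse(prediction, [1.0, 2.0, 5.0], layout)
print(layout.indicator, loss)
```

The point of the slice is that the reference tokens take part in attention but never contribute to the loss, so the model is only ever trained to predict the target region.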
|
|
| These changes have to be mirrored in ComfyUI as well: |
|
|
| ComfyUI core: Extended the native Ideogram 4 model to accept reference latents and reproduce the training sequence `[text | noisy output | clean reference]`, |
| including the separate indicator, MRoPE coordinate, clean timestep, and output-only prediction slicing. |
|
|
| Custom node: Ideogram4ReferenceConditioning resizes and VAE-encodes a reference image to match the target latent, then attaches it only to positive |
| conditioning so the separate unconditional model remains unchanged. |
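
As a rough illustration of the "positive only" idea, here is a minimal sketch that uses plain Python stand-ins instead of tensors. It is not the code of Ideogram4ReferenceConditioning: the helper names, the `reference_latents` key, and the explicit `vae_downscale` parameter are assumptions for this example. The only thing taken from ComfyUI is the general shape of conditioning as a list of `[tensor, options]` pairs.

```python
from typing import Any, Dict, List, Tuple

# ComfyUI passes conditioning around as a list of [tensor, options] pairs.
Conditioning = List[Tuple[Any, Dict[str, Any]]]


def reference_pixel_size(latent_height: int, latent_width: int, vae_downscale: int) -> Tuple[int, int]:
    """Pixel size the reference is resized to so that its latent matches the target latent."""
    return latent_height * vae_downscale, latent_width * vae_downscale


def attach_reference(conditioning: Conditioning, ref_latent: Any,
                     key: str = "reference_latents") -> Conditioning:
    """Return a copy of `conditioning` with `ref_latent` appended under `key` (hypothetical key name)."""
    out = []
    for cond, options in conditioning:
        new_options = dict(options)  # shallow copy so the input stays untouched
        new_options[key] = list(new_options.get(key, [])) + [ref_latent]
        out.append((cond, new_options))
    return out


positive = [("positive-embedding", {})]
negative = [("negative-embedding", {})]
ref_latent = "encoded-reference"  # stand-in for the VAE-encoded reference tensor

# Only the positive branch receives the reference; the negative branch is passed on as-is.
positive_with_ref = attach_reference(positive, ref_latent)
print(positive_with_ref, negative)
```

Copying the options dictionary instead of mutating it keeps the original conditioning reusable elsewhere in the workflow.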
|
|
| ## Credits |
|
|
| Credits go to: |
| - [ideogram-ai](https://huggingface.co/ideogram-ai) for releasing a highly interesting, high-quality new image model. |
| - Ostris for [AI-Toolkit](https://github.com/ostris/ai-toolkit). |
| - [Comfy-Org](https://github.com/Comfy-Org) and [Kijai](https://huggingface.co/Kijai) for [ComfyUI](https://github.com/comfy-org/ComfyUI) itself and day-zero support for Ideogram 4. |
|
|
|
|
| ## Disclaimer |
|
|
| I am in no way affiliated with Ideogram, Inc. |
| The LoRAs provided here are my own experimental work. |
| Please see the license linked above. |