---
license: apache-2.0
library_name: ComfyUI
pipeline_tag: image-to-image
tags:
- comfyui
- image-editing
- joyai
base_model: jdopensource/JoyAI-Image-Edit-Diffusers
---
# JoyAI-Image-Edit (ComfyUI weights)
Single-file `.safetensors` checkpoints of [JoyAI-Image-Edit](https://github.com/jd-opensource/JoyAI-Image), repackaged for **native ComfyUI** support (no custom node required).
JoyAI-Image-Edit is the single-image instruction-guided editing model of the [JoyAI-Image](https://github.com/jd-opensource/JoyAI-Image) family. It takes one reference image plus a text instruction and generates the edited result.
## Files
| File | Size | Goes into | Component |
|------|------|-----------|-----------|
| `diffusion_models/joy_image_edit_bf16.safetensors` | ~31 GB | `ComfyUI/models/diffusion_models/` | `JoyImageEditTransformer3DModel` (bf16) |
| `text_encoders/qwen3vl_joyimage_bf16.safetensors` | ~17 GB | `ComfyUI/models/text_encoders/` | Qwen3-VL-8B text encoder (bf16) |
| `vae/joy_image_edit_vae.safetensors` | ~243 MB | `ComfyUI/models/vae/` | `AutoencoderKLWan` |
The repo's directory layout already matches `ComfyUI/models/`, so a single `hf download` into your models root drops every file where it needs to go.
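To confirm that the download placed every file where the table says it should go, a small check like the following can help. It is a minimal sketch: the relative paths are copied from the table above, while the helper name `missing_weights` and the idea of passing your models root as an argument are illustrative assumptions, not part of ComfyUI.

```python
from pathlib import Path

# Relative paths copied from the Files table; they mirror ComfyUI/models/.
EXPECTED_FILES = [
    "diffusion_models/joy_image_edit_bf16.safetensors",
    "text_encoders/qwen3vl_joyimage_bf16.safetensors",
    "vae/joy_image_edit_vae.safetensors",
]


def missing_weights(models_root):
    """Return the expected files that are not present under models_root.

    Hypothetical helper for illustration; it only checks that each file
    exists, not that its size or contents are correct.
    """
    root = Path(models_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_weights("/path/to/ComfyUI/models")` returns an empty list when all three files are in place, and otherwise lists the relative paths that still need to be downloaded.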
## Installation
The model runs in ComfyUI without custom nodes. Native support is proposed upstream in [Comfy-Org/ComfyUI#14428](https://github.com/Comfy-Org/ComfyUI/pull/14428); until that PR is merged, install the fork branch:
```bash
git clone -b joyimage-edit-pr https://github.com/feice-huang/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
```
Once the PR is merged upstream, a stock ComfyUI release will run these weights without the fork.
Next, download the weights straight into `ComfyUI/models/`:
```bash
hf download jdopensource/JoyAI-Image-Edit-ComfyUI \
--local-dir /path/to/ComfyUI/models
```
Restart ComfyUI.
## Usage
Build the graph from these native nodes:
1. **Load Diffusion Model** (`UNETLoader`) → `diffusion_models/joy_image_edit_bf16.safetensors`
2. **Load CLIP** (`CLIPLoader`) → `text_encoders/qwen3vl_joyimage_bf16.safetensors`, type `joyimage`
3. **Load VAE** (`VAELoader`) → `vae/joy_image_edit_vae.safetensors`
4. **Load Image** (`LoadImage`) for the reference
5. **TextEncodeJoyImageEdit**: feed `clip`, `vae`, the instruction, and the reference `image`. Wire one instance for the positive prompt and one (empty prompt, same image) for the negative. The node bucket-resizes the reference to the 1024-base buckets, VAE-encodes it, and appends the reference latent to the conditioning; its `image` output feeds `VAEDecode` / empty-latent sizing.
6. **KSampler** → **VAEDecode** → **SaveImage**
Example workflow: [workflow_joyimage_edit.json](https://github.com/user-attachments/files/28871922/workflow_joyimage_edit.json)
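For readers unfamiliar with bucket resizing, the idea in step 5 can be sketched as picking, from a fixed set of resolutions near 1024×1024, the one whose aspect ratio best matches the reference image. The bucket list below is hypothetical and chosen only for illustration; the node's actual bucket table and selection rule are not documented here and may differ.

```python
import math

# HYPOTHETICAL (width, height) buckets, each close to 1024 * 1024 pixels.
# These values are an assumption, not the node's real bucket table.
BUCKETS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
]


def pick_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to width / height.

    Ratios are compared on a log scale so that landscape and portrait
    deviations are treated symmetrically.
    """
    target = math.log(width / height)
    return min(buckets, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))
```

Under this assumed list, a 1920×1080 reference maps to the 1344×768 bucket and a square reference maps to 1024×1024, which is why the Resolution setting can be left on auto.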
## Recommended parameters
| Parameter | Value |
|-----------|-------|
| Steps | 40 |
| CFG | 4.0 |
| Sampler | `euler` |
| Scheduler | `simple` |
| dtype | bf16 |
| Resolution | auto (1024-base buckets) |
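If you drive ComfyUI from a script, the sampler values in the table can be applied to a workflow exported in API format, where each node is a dict with `class_type` and `inputs`. The sketch below assumes that format and the standard `KSampler` input names (`steps`, `cfg`, `sampler_name`, `scheduler`); the helper `apply_recommended` is hypothetical, and whether the linked example workflow is already in API format is not stated here.

```python
import copy

# Values from the Recommended parameters table, keyed by KSampler input name.
RECOMMENDED = {
    "steps": 40,
    "cfg": 4.0,
    "sampler_name": "euler",
    "scheduler": "simple",
}


def apply_recommended(workflow):
    """Return a copy of an API-format workflow with KSampler inputs patched.

    Hypothetical helper: only nodes whose class_type is "KSampler" are
    touched, and other inputs (seed, links to other nodes) are left as-is.
    """
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        if isinstance(node, dict) and node.get("class_type") == "KSampler":
            node.setdefault("inputs", {}).update(RECOMMENDED)
    return patched
```

Because the function works on a deep copy, the original workflow dict is left unchanged, and you can compare the two to see exactly which sampler inputs were overridden.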
## GGUF quantizations
Lower-bit GGUF quants of the transformer and text encoder are available at [huangfeice/JoyAI-Image-Edit-Diffusers-GGUF](https://huggingface.co/huangfeice/JoyAI-Image-Edit-Diffusers-GGUF) (community contribution). The VAE in this repo is the only VAE you need, since the GGUF release does not quantize the VAE.
## Links
- Source code and documentation: [github.com/jd-opensource/JoyAI-Image](https://github.com/jd-opensource/JoyAI-Image)
- Original Diffusers-format weights: [jdopensource/JoyAI-Image-Edit-Diffusers](https://huggingface.co/jdopensource/JoyAI-Image-Edit-Diffusers)
- Multi-image edit model (ComfyUI): [jdopensource/JoyAI-Image-Edit-Plus-ComfyUI](https://huggingface.co/jdopensource/JoyAI-Image-Edit-Plus-ComfyUI)