license: apache-2.0
library_name: ComfyUI
pipeline_tag: image-to-image
tags:
- comfyui
- image-editing
- joyai
base_model: jdopensource/JoyAI-Image-Edit-Diffusers
JoyAI-Image-Edit (ComfyUI weights)
Single-file .safetensors checkpoints of JoyAI-Image-Edit, repackaged for native ComfyUI support (no custom node required).
JoyAI-Image-Edit is the single-image instruction-guided editing model of the JoyAI-Image family. It takes one reference image plus a text instruction and generates the edited result.
Files
| File | Size | Goes into | Component |
|---|---|---|---|
| diffusion_models/joy_image_edit_bf16.safetensors | ~31 GB | ComfyUI/models/diffusion_models/ | JoyImageEditTransformer3DModel (bf16) |
| text_encoders/qwen3vl_joyimage_bf16.safetensors | ~17 GB | ComfyUI/models/text_encoders/ | Qwen3-VL-8B text encoder (bf16) |
| vae/joy_image_edit_vae.safetensors | ~243 MB | ComfyUI/models/vae/ | AutoencoderKLWan |
The repo's directory layout already matches ComfyUI/models/, so a single hf download into your models root drops every file where it needs to go.
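The three files add up to roughly 48 GB (31 + 17 + 0.243), so it can be worth checking free space before downloading. The following is a minimal sketch, not part of the repo: `has_free_gb` is a hypothetical helper name, and the 50 GiB threshold is an assumed safety margin rather than a documented requirement.

```shell
# Succeed when the filesystem holding DIR has at least NEED_GB GiB available.
# Usage: has_free_gb DIR NEED_GB
has_free_gb() {
  dir="$1"
  need_gb="$2"
  # df -P prints one POSIX-format line per filesystem;
  # with -k, column 4 is the available space in 1024-byte blocks.
  avail_kb="$(df -Pk "$dir" | awk 'NR==2 {print $4}')"
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

For example, `has_free_gb /path/to/ComfyUI/models 50 || echo "not enough free space"` prints a warning when the target filesystem is too small.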
Installation
The model runs on ComfyUI's native nodes, but that support is still only proposed upstream in Comfy-Org/ComfyUI#14428. Until the PR is merged, install the fork branch:
git clone -b joyimage-edit-pr https://github.com/feice-huang/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
Once the PR is merged upstream, the stock ComfyUI release will run these weights with no fork needed.
Then download the weights straight into ComfyUI/models/:
hf download jdopensource/JoyAI-Image-Edit-ComfyUI \
--local-dir /path/to/ComfyUI/models
Restart ComfyUI.
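To confirm the download landed where ComfyUI expects it, a small check like the one below can help. It is a sketch that assumes only the three relative paths from the Files table; `missing_joyimage_files` is a hypothetical helper name, not something shipped with ComfyUI or this repo.

```shell
# Print every expected checkpoint file that is missing under a models root.
# Usage: missing_joyimage_files /path/to/ComfyUI/models
missing_joyimage_files() {
  root="$1"
  for rel in \
    diffusion_models/joy_image_edit_bf16.safetensors \
    text_encoders/qwen3vl_joyimage_bf16.safetensors \
    vae/joy_image_edit_vae.safetensors
  do
    # Report the relative path when no regular file exists at that location.
    [ -f "$root/$rel" ] || printf '%s\n' "$rel"
  done
}
```

An empty output means all three files are in place; any printed path names a file that still needs to be downloaded.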
Usage
Build the graph from these native nodes:
- Load Diffusion Model (UNETLoader) → diffusion_models/joy_image_edit_bf16.safetensors
- Load CLIP (CLIPLoader) → text_encoders/qwen3vl_joyimage_bf16.safetensors, type joyimage
- Load VAE (VAELoader) → vae/joy_image_edit_vae.safetensors
- Load Image (LoadImage) for the reference
- TextEncodeJoyImageEdit: feed clip, vae, the instruction, and the reference image. Wire one instance for the positive prompt and one (empty prompt, same image) for the negative. The node bucket-resizes the reference to the 1024-base buckets, VAE-encodes it, and appends the reference latent to the conditioning; its image output feeds VAEDecode / empty-latent sizing.
- KSampler → VAEDecode → SaveImage
Example workflow: workflow_joyimage_edit.json
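Once the graph works in the UI, it can also be queued from a script through the ComfyUI server's `POST /prompt` endpoint. The sketch below makes several assumptions: the workflow has been exported from the UI in API format (the shipped example file may be in UI format, which this endpoint does not accept as-is), the server listens on the default `http://127.0.0.1:8188`, and the function names are hypothetical.

```shell
# Wrap an API-format workflow JSON file in the body expected by POST /prompt.
# Usage: build_prompt_payload WORKFLOW_API_JSON
build_prompt_payload() {
  printf '{"prompt": %s}\n' "$(cat "$1")"
}

# Queue the workflow on a running ComfyUI server.
# Usage: queue_workflow WORKFLOW_API_JSON [SERVER_URL]
# The default server address is an assumption; adjust it to your setup.
queue_workflow() {
  build_prompt_payload "$1" |
    curl -sS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "${2:-http://127.0.0.1:8188}/prompt"
}
```

With a hypothetical API-format export named `workflow_joyimage_edit_api.json`, running `queue_workflow workflow_joyimage_edit_api.json` would submit the job to the local server.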
Recommended parameters
| Parameter | Value |
|---|---|
| Steps | 40 |
| CFG | 4.0 |
| Sampler | euler |
| Scheduler | simple |
| dtype | bf16 |
| Resolution | auto (1024-base buckets) |
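In an API-format workflow, the sampler rows of this table map onto the scalar inputs of the KSampler node. The sketch below prints them as a JSON object. The `seed` argument and the `denoise` value of 1.0 are assumptions, not recommendations from the table, and the link inputs (`model`, `positive`, `negative`, `latent_image`) are left out because they depend on the node ids in your graph.

```shell
# Print the scalar KSampler inputs for an API-format workflow as JSON.
# Usage: ksampler_scalar_inputs [SEED]
# steps, cfg, sampler_name and scheduler come from the table above;
# seed (default 0) and denoise (1.0) are illustrative assumptions.
ksampler_scalar_inputs() {
  seed="${1:-0}"
  printf '{"seed": %s, "steps": 40, "cfg": 4.0, "sampler_name": "euler", "scheduler": "simple", "denoise": 1.0}\n' "$seed"
}
```

Calling `ksampler_scalar_inputs 1234` yields an object that can be merged into the `inputs` of the KSampler entry.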
GGUF quantizations
Lower-bit GGUF quants of the transformer and text encoder are available at huangfeice/JoyAI-Image-Edit-Diffusers-GGUF (community contribution). The VAE here is the only VAE you need, because the GGUF release does not quantize the VAE.
Links
- Source code and documentation: github.com/jd-opensource/JoyAI-Image
- Original Diffusers-format weights: jdopensource/JoyAI-Image-Edit-Diffusers
- Multi-image edit model (ComfyUI): jdopensource/JoyAI-Image-Edit-Plus-ComfyUI