---
base_model:
- timbrooks/instruct-pix2pix
- SherryXTChen/Instruct-CLIP
- SherryXTChen/LatentDiffusionDINOv2
datasets:
- SherryXTChen/InstructCLIP-InstructPix2Pix-Data
language:
- en
library_name: diffusers
license: apache-2.0
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- diffusers-training
- image-to-image
inference: true
pipeline_tag: image-to-image
---

# Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

This model is based on the paper [Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning](https://huggingface.co/papers/2503.18406).

GitHub: https://github.com/SherryXTChen/Instruct-CLIP.git

## Example

```python
import requests
import torch
from PIL import Image, ImageOps
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

# Load the base InstructPix2Pix pipeline and apply the Instruct-CLIP LoRA weights
model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.load_lora_weights("SherryXTChen/InstructCLIP-InstructPix2Pix")
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

url = "https://raw.githubusercontent.com/SherryXTChen/Instruct-CLIP/refs/heads/main/assets/1_input.jpg"

def download_image(url):
    # Fetch the image, respect its EXIF orientation, and force RGB mode
    image = Image.open(requests.get(url, stream=True).raw)
    image = ImageOps.exif_transpose(image)
    return image.convert("RGB")

image = download_image(url)

prompt = "as a 3 d sculpture"
# guidance_scale and image_guidance_scale can also be passed here to trade off
# edit strength against faithfulness to the input image
images = pipe(prompt, image=image, num_inference_steps=20).images
images[0].save("output.jpg")
```