	---
	pipeline_tag: image-feature-extraction
	license: mit
	library_name: diffusion-single-file
	---

	# CleanDIFT Model Card


	Diffusion models learn powerful world representations that have proven valuable for tasks like semantic correspondence detection,
	depth estimation, semantic segmentation, and classification.
	However, diffusion models require noisy input images. Adding noise destroys information and introduces the noise level as a hyperparameter that must be tuned for each task.
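To make the role of the noise level concrete, the following minimal sketch computes the signal-to-noise ratio of the standard DDPM forward process, `x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps`, at a few timesteps. The linear beta schedule and its endpoints are illustrative assumptions, not the schedule of any particular Stable Diffusion checkpoint, and the helper names are hypothetical.

```python
import math

def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Illustrative linear beta schedule (assumed values, not taken from the model card)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def alpha_bars(betas):
    """Cumulative product of (1 - beta_t): the fraction of signal variance kept at step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out

def snr(alpha_bar):
    """Signal-to-noise ratio of x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    return alpha_bar / (1.0 - alpha_bar)

abar = alpha_bars(linear_betas())
for t in (0, 100, 250, 500, 999):
    print(f"t={t:4d}  signal scale={math.sqrt(abar[t]):.3f}  SNR={snr(abar[t]):.4g}")
```

Under this schedule the SNR falls monotonically with the timestep, so every choice of timestep hands the network a differently degraded image. This is the per-task tuning burden described above.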




	We introduce CleanDIFT, a novel method to extract noise-free, timestep-independent features by enabling diffusion models to work directly with clean input images.
	The approach is efficient, training on a single GPU in just 30 minutes. We publish these models alongside our paper ["CleanDIFT: Diffusion Features without Noise"](https://compvis.github.io/cleandift/).

	We provide checkpoints for Stable Diffusion 1.5 and Stable Diffusion 2.1.


	## Usage

	For detailed examples of how to extract features with CleanDIFT and use them for downstream tasks, see the [example notebooks](https://github.com/CompVis/CleanDIFT/tree/main/notebooks).
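As a rough illustration of one downstream use, semantic correspondence is commonly computed by nearest-neighbor matching of per-location feature vectors under cosine similarity. The sketch below uses toy hand-written feature vectors in plain Python. It is not the code from the notebooks, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_features(source_feats, target_feats):
    """For each source vector, return the index of the most similar target vector."""
    matches = []
    for src in source_feats:
        sims = [cosine_similarity(src, tgt) for tgt in target_feats]
        matches.append(max(range(len(sims)), key=sims.__getitem__))
    return matches

# Toy stand-ins for per-location features of two images; real features would
# come from the U-Net and be much higher-dimensional.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.2]]
target = [[0.1, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
print(match_features(source, target))  # prints [1, 0]
```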

	Our checkpoints are fully compatible with the `diffusers` library.
	If you already have a pipeline using SD 1.5 or SD 2.1 from `diffusers`, you can simply replace the U-Net state dict:
	```python
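	from diffusers import UNet2DConditionModel
	from huggingface_hub import hf_hub_download
	from safetensors.torch import load_file  # reads the .safetensors checkpoint

	# Load the stock SD 2.1 U-Net (architecture and original weights).
	unet = UNet2DConditionModel.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="unet")
	# Download the CleanDIFT U-Net weights from the Hugging Face Hub.
	ckpt_pth = hf_hub_download(repo_id="CompVis/cleandift", filename="cleandift_sd21_unet.safetensors")
	# Overwrite the U-Net weights; strict=True fails loudly on any key mismatch.
	state_dict = load_file(ckpt_pth)
	unet.load_state_dict(state_dict, strict=True)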
	```
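Because `strict=True` makes `load_state_dict` raise an error when the checkpoint and model keys do not match exactly, it can help to compare the two key sets first, for example to spot a checkpoint saved with a different key prefix. The helper below is a hypothetical convenience, not part of `diffusers` or this repository. It accepts any iterables of key names, and the key names in the example are toy values.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring what strict loading checks.

    missing: keys the model expects but the checkpoint lacks.
    unexpected: keys in the checkpoint that the model does not define.
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected

# Toy key names for illustration only; real U-Net state dicts have many more entries.
model_keys = ["conv_in.weight", "conv_in.bias", "conv_out.weight"]
checkpoint_keys = ["conv_in.weight", "conv_in.bias", "unet.conv_out.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)        # prints missing: ['conv_out.weight']
print("unexpected:", unexpected)  # prints unexpected: ['unet.conv_out.weight']
```

With the objects from the snippet above, the call would be `diff_state_dict_keys(unet.state_dict().keys(), state_dict.keys())`. Two empty lists mean the key sets agree. Tensor shapes are still checked by `load_state_dict` itself.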

	## Citation

	```bibtex
	@misc{stracke2024cleandiftdiffusionfeaturesnoise,
	title={CleanDIFT: Diffusion Features without Noise},
	author={Nick Stracke and Stefan Andreas Baumann and Kolja Bauer and Frank Fundel and Björn Ommer},
	year={2024},
	eprint={2412.03439},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2412.03439},
	}
	```