# IntrinsicWeather (Diffusers)
Diffusers-format checkpoint for **[IntrinsicWeather: Controllable Weather Editing in Intrinsic Space](https://arxiv.org/pdf/2508.06982v6)** (CVPR 2026 Highlight).
This repo bundles inverse rendering, forward weather rendering, and the IMAA gating module into a single Hugging Face–compatible layout. Shared Stable Diffusion 3 components (VAE, text encoders, tokenizers, scheduler) are stored once; task-specific transformers live under `transformer/<variant>/`.
## Model layout
```
IntrisicWeather-diffusers/
├── dinov2/                                 # bundled DINOv2 weights (for IMAA / decomposition)
├── imaa/                                   # Intrinsic Map-Aware Attention weights
├── text_encoder/, text_encoder_2/, text_encoder_3/
├── tokenizer/, tokenizer_2/, tokenizer_3/
├── vae/, scheduler/
├── transformer/
│   ├── inverse-512/                        # IntrinsicWeatherSD3Transformer2DModel (in_channels=32)
│   │   └── transformer_intrinsic_weather.py
│   └── forward/                            # SD3Transformer2DModel (in_channels=96)
│       └── lora/                           # forward-renderer LoRA (loaded by default)
├── pipeline_intrinsic_weather.py           # unified: RGB → maps → weather RGB
├── pipeline_intrinsic_weather_inverse.py   # inverse only
├── pipeline_intrinsic_weather_forward.py   # forward only
├── pipeline_utils.py
├── model_index.json
├── convert_inverse_renderer_512.py
├── convert_forward_renderer.py
└── test_all_pipelines.py
```
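Before loading anything, it can help to confirm that a local copy matches the layout above. The following is a minimal sketch, not part of the repo: `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, and the entry list is simply transcribed from the tree.

```python
from pathlib import Path

# Entries transcribed from the layout tree; adjust if your copy differs.
EXPECTED_ENTRIES = [
    "dinov2", "imaa", "vae", "scheduler",
    "text_encoder", "text_encoder_2", "text_encoder_3",
    "tokenizer", "tokenizer_2", "tokenizer_3",
    "transformer/inverse-512", "transformer/forward", "transformer/forward/lora",
    "model_index.json", "pipeline_intrinsic_weather.py", "pipeline_utils.py",
]


def missing_entries(repo_dir):
    """Return the expected files/folders that do not exist under repo_dir."""
    root = Path(repo_dir)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    print("layout OK" if not missing else f"missing: {missing}")
```

An empty list means every listed entry was found; it says nothing about whether the weight files inside those folders are complete.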
| Component | Source | Notes |
|-----------|--------|-------|
| Inverse transformer | [GilgameshYX/InverseRenderer-512](https://huggingface.co/GilgameshYX/InverseRenderer-512) | 512×512 decomposition |
| Forward transformer + LoRA | [GilgameshYX/ForwardRenderer](https://huggingface.co/GilgameshYX/ForwardRenderer) | LoRA in `transformer/forward/lora/` |
| IMAA | InverseRenderer-512 `imaa.pth` | Required for map-aware inverse attention |
| SD3 shared weights | `stabilityai/stable-diffusion-3-medium-diffusers` | VAE + text encoders only |
| Transformer config | `stabilityai/stable-diffusion-3.5-medium` | Architecture template for weight loading |
## Requirements
- Python 3.10+
- CUDA GPU recommended (~20 GB VRAM for full end-to-end inference at 512×512)
- `torch`, `diffusers>=0.38`, `transformers`, `safetensors`, `torchvision`, `Pillow`
```bash
pip install torch diffusers transformers safetensors torchvision pillow accelerate
```
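The version floors above (Python 3.10+, `diffusers>=0.38`) can be checked programmatically. This is a rough sketch using only the standard library; `version_tuple` and `check_environment` are hypothetical helpers, and the parser only looks at leading numeric components, so it is not a full PEP 440 implementation.

```python
import sys
from importlib import metadata


def version_tuple(version):
    """Parse leading numeric components, e.g. '0.38.1.dev0' -> (0, 38, 1)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(min_diffusers=(0, 38), min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required")
    try:
        installed = metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        problems.append("diffusers is not installed")
    else:
        if version_tuple(installed) < min_diffusers:
            problems.append(f"diffusers {installed} is older than the required minimum")
    return problems


if __name__ == "__main__":
    print(check_environment() or "environment looks OK")
```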
## Quick start (end-to-end weather edit)
The unified pipeline decomposes an input RGB image into intrinsic maps, then renders a weather-conditioned result. **DINOv2** is required for decomposition (bundled under `dinov2/`, or use `facebook/dinov2-base` from Hugging Face).
```python
from pathlib import Path

import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

from pipeline_intrinsic_weather import IntrinsicWeatherPipeline

repo_dir = Path(".").resolve()  # path to this folder
device = "cuda"
dtype = torch.bfloat16

# Load the unified pipeline from the local folder, including LoRA and IMAA weights.
pipe = IntrinsicWeatherPipeline.from_pretrained(
    repo_dir,
    inverse_transformer_subfolder="inverse-512",
    forward_transformer_subfolder="forward",
    device=device,
    local_files_only=True,
    torch_dtype=dtype,
    load_lora=True,
    load_imaa=True,
)

# DINOv2 is loaded separately and passed to the pipeline at call time.
dino_path = repo_dir / "dinov2"
dino_processor = AutoImageProcessor.from_pretrained(dino_path, local_files_only=True)
dino_model = AutoModel.from_pretrained(dino_path, local_files_only=True).to(device)
dino_model.eval()

image = Image.open("input.png").convert("RGB")

result = pipe(
    image=image,
    weather="snowy",  # rainy | sunny | snowy | foggy | overcast | night
    dino_model=dino_model,
    dino_processor=dino_processor,
    image_size=512,
    render_size=512,
    num_inverse_steps=50,
    num_forward_steps=50,
    guidance_scale=6.0,
    image_guidance_scale=1.5,
    generator=torch.Generator(device=device).manual_seed(42),
)
result.images[0].save("output_snowy.png")
```
Run from inside this directory (or add it to `PYTHONPATH`) so `pipeline_intrinsic_weather.py` and `imaa/` resolve correctly.
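If changing the working directory or setting `PYTHONPATH` is inconvenient, the same effect for imports can be had from inside a script. A minimal sketch follows; `add_repo_to_path` is a hypothetical helper, and it only makes the repo's Python modules importable. Whether weight folders such as `imaa/` are also resolved relative to the working directory is not something this sketch verifies.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir):
    """Prepend repo_dir to sys.path (once) so its top-level modules are importable."""
    repo_path = str(Path(repo_dir).resolve())
    if repo_path not in sys.path:
        sys.path.insert(0, repo_path)
    return repo_path


# Example (the path is a placeholder for wherever this folder lives):
# add_repo_to_path("/path/to/IntrisicWeather-diffusers")
# from pipeline_intrinsic_weather import IntrinsicWeatherPipeline
```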
## Pipelines
### 1. `IntrinsicWeatherPipeline` (unified)
Full pipeline: **RGB → intrinsic maps → weather RGB**.
```python
pipe = IntrinsicWeatherPipeline.from_pretrained(
    repo_dir,
    inverse_transformer_subfolder="inverse-512",
    forward_transformer_subfolder="forward",
    device="cuda",
    torch_dtype=torch.bfloat16,
)
```
Useful kwargs:
| Argument | Default | Description |
|----------|---------|-------------|
| `inverse_transformer_subfolder` | `"inverse-512"` | Inverse transformer under `transformer/` |
| `forward_transformer_subfolder` | `"forward"` | Forward transformer under `transformer/` |
| `load_lora` | `True` | Load LoRA from `transformer/forward/lora/` |
| `load_imaa` | `True` | Load IMAA weights from `imaa/` |
| `device` | `None` | Moves all modules to device (IMAA stays float32) |
Sub-methods:
- `pipe.decompose(image, dino_model, dino_processor, ...)` → dict of intrinsic maps
- `pipe.render(maps, weather="rainy", ...)` → weather-conditioned RGB
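Because decomposition and rendering are exposed separately, one natural pattern is to decompose an image once and render it under several weather conditions. The sketch below assumes that `decompose` and `render` can be called with only the arguments shown in the list above (any others left at their defaults) and that the returned maps can be reused across calls; `render_weathers` is a hypothetical helper, not part of the repo.

```python
def render_weathers(pipe, image, dino_model, dino_processor, weathers):
    """Decompose `image` once, then render one result per weather key."""
    # Assumption: the dict returned by decompose() can be passed to render() repeatedly.
    maps = pipe.decompose(image, dino_model, dino_processor)
    return {weather: pipe.render(maps, weather=weather) for weather in weathers}


# Example, reusing the objects from Quick start:
# outputs = render_weathers(pipe, image, dino_model, dino_processor, ["rainy", "snowy"])
```

This avoids repeating the inverse pass for every weather variant; the values in the returned dict are whatever `pipe.render` returns.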
### 2. `IntrinsicWeatherInversePipeline`
Inverse rendering only (single intrinsic map per call).
```python
from pipeline_intrinsic_weather_inverse import IntrinsicWeatherInversePipeline

pipe = IntrinsicWeatherInversePipeline.from_pretrained(
    repo_dir,
    transformer_subfolder="inverse-512",
    device="cuda",
    torch_dtype=torch.bfloat16,
)
```
Load the transformer separately if needed:
```python
transformer = IntrinsicWeatherInversePipeline.load_transformer(
    "inverse-512", repo_dir, device="cuda"
)
pipe = IntrinsicWeatherInversePipeline.from_pretrained(
    repo_dir, transformer=transformer, device="cuda"
)
```
IMAA and DINO are used by the unified pipeline’s `decompose()` path; for standalone inverse calls, pass `map_aware_mask` from IMAA manually (see `test_all_pipelines.py`).
### 3. `IntrinsicWeatherForwardPipeline`
Forward weather rendering from intrinsic maps.
```python
from pipeline_intrinsic_weather_forward import IntrinsicWeatherForwardPipeline

pipe = IntrinsicWeatherForwardPipeline.from_pretrained(
    repo_dir,
    transformer_subfolder="forward",
    device="cuda",
    torch_dtype=torch.bfloat16,
    load_lora=True,
)
```
LoRA weights are read from `transformer/forward/lora/` when `load_lora=True`.
## Weather presets
Built-in weather keys (or pass a custom prompt string):
| Key | Prompt |
|-----|--------|
| `rainy` | A rainy day. |
| `sunny` | A sunny day. |
| `snowy` | A snowy day. |
| `foggy` | A foggy day. |
| `overcast` | An overcast day. |
| `night` | A night scene. |
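The key-or-custom-prompt behavior can be pictured as a dictionary lookup with a pass-through fallback. This is an illustrative sketch that mirrors the table, not the pipeline's actual code: `WEATHER_PROMPTS` and `resolve_weather_prompt` are hypothetical names, and details such as case sensitivity in the real lookup are not documented here.

```python
# Preset prompts transcribed from the table above.
WEATHER_PROMPTS = {
    "rainy": "A rainy day.",
    "sunny": "A sunny day.",
    "snowy": "A snowy day.",
    "foggy": "A foggy day.",
    "overcast": "An overcast day.",
    "night": "A night scene.",
}


def resolve_weather_prompt(weather):
    """Return the preset prompt for a known key; otherwise treat the input as a custom prompt."""
    return WEATHER_PROMPTS.get(weather, weather)


if __name__ == "__main__":
    print(resolve_weather_prompt("snowy"))
    print(resolve_weather_prompt("A stormy evening with heavy rain."))
```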
## Intrinsic maps (AoVs)
The inverse renderer produces five intrinsic maps, also called AoVs (arbitrary output variables):
`albedo`, `normal`, `roughness`, `metallic`, `irradiance`
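When passing maps between the inverse and forward stages, a quick key check can catch a missing or misspelled map early. The sketch below assumes the maps dict is keyed by exactly the five names listed above, which this README does not state explicitly; `EXPECTED_MAPS` and `check_intrinsic_maps` are hypothetical names.

```python
# Map names as listed in this section; the dict keys are assumed to match them.
EXPECTED_MAPS = ("albedo", "normal", "roughness", "metallic", "irradiance")


def check_intrinsic_maps(maps):
    """Return (missing, unexpected) key lists for a dict of intrinsic maps."""
    missing = [name for name in EXPECTED_MAPS if name not in maps]
    unexpected = [name for name in maps if name not in EXPECTED_MAPS]
    return missing, unexpected


# Example:
# missing, unexpected = check_intrinsic_maps(pipe.decompose(image, dino_model, dino_processor))
```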
## Loading transformers manually
Transformers are stored per variant under `transformer/<subfolder>/`. Use `pipeline_utils.load_transformer_from_subfolder`:
```python
from pipeline_utils import load_transformer_from_subfolder, load_transformer_lora
inverse = load_transformer_from_subfolder(repo_dir, "inverse-512", device="cuda")
forward = load_transformer_from_subfolder(repo_dir, "forward", device="cuda")
```
- `inverse-512` uses a custom `IntrinsicWeatherSD3Transformer2DModel` (`in_channels=32`).
- `forward` uses the standard `SD3Transformer2DModel` (`in_channels=96`).
## Dtype and device notes
- Default dtype is **`torch.bfloat16`** for transformers, VAE, and text encoders.
- **IMAA** stays in **float32** (DINO patch tokens are float32).
- Pass `device="cuda"` to `from_pretrained` on all three pipeline classes; the unified pipeline moves every registered module to the target device automatically.
## Testing
Smoke-test all pipelines on CUDA:
```bash
python test_all_pipelines.py
```
The script runs 2-step inverse, forward (with LoRA), and unified load checks in `bfloat16`.
## Re-converting from original checkpoints
If you have the raw GilgameshYX checkpoints:
```bash
# Inverse renderer (512) + IMAA
python convert_inverse_renderer_512.py
# Forward renderer + LoRA
python convert_forward_renderer.py
```
See `conversion_metadata.json` and `conversion_metadata_forward.json` for source paths used during conversion.
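To inspect those metadata files without opening them by hand, something like the following could be used. It is a sketch under the assumption that both files sit at the repo root (they are not shown in the layout tree) and makes no assumption about their keys; `load_conversion_metadata` is a hypothetical helper.

```python
import json
from pathlib import Path

METADATA_FILES = ("conversion_metadata.json", "conversion_metadata_forward.json")


def load_conversion_metadata(repo_dir):
    """Load whichever conversion metadata files exist directly under repo_dir."""
    found = {}
    for name in METADATA_FILES:
        path = Path(repo_dir) / name
        if path.is_file():
            found[name] = json.loads(path.read_text(encoding="utf-8"))
    return found


if __name__ == "__main__":
    for name, content in load_conversion_metadata(".").items():
        print(name)
        print(json.dumps(content, indent=2))
```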
## Hugging Face Hub loading
When published to the Hub, load with `trust_remote_code=True`:
```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/IntrisicWeather-diffusers",
    custom_pipeline="pipeline_intrinsic_weather.py",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
```
For local use, importing `IntrinsicWeatherPipeline` directly (as in Quick start) is simpler and avoids Hub cache path issues with custom modules.
## References
- **Paper:** [IntrinsicWeather (arXiv:2508.06982)](https://arxiv.org/pdf/2508.06982v6)
- **Project page:** https://yixinzhu042.github.io/IntrinsicWeather/
- **Upstream diffusers repo:** [IntrinsicWeather-diffusers](https://github.com/YixinZhu042/IntrinsicWeather)
- **Original weights:** [GilgameshYX/InverseRenderer-512](https://huggingface.co/GilgameshYX/InverseRenderer-512), [GilgameshYX/ForwardRenderer](https://huggingface.co/GilgameshYX/ForwardRenderer)
## License
Weights and code follow the licenses of the upstream IntrinsicWeather project and the Stable Diffusion 3 components used for shared modules.