🔥 News

[2026/05/18] Released inference code, model weights, and the SA-Z dataset.
[2026/05/18] Released the OcclusionFormer open-source package in this repository.
[2026/04/30] OcclusionFormer was accepted to ICML 2026.

😊 Introduction

OcclusionFormer addresses a core challenge in layout-to-image generation: when multiple bounding boxes overlap, standard methods often produce entangled textures and incorrect front/back ordering.

From the paper, OcclusionFormer introduces explicit Z-order modeling for layout-grounded generation by:

decoupling instance generation,
arranging occlusion order with a volume-rendering-inspired transmittance mechanism,
and enforcing spatial precision with a queried alignment objective.
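
To give a feel for the transmittance idea, here is a minimal sketch of standard front-to-back compositing as used in volume rendering. It is an illustration only, not the paper's formulation: the function name `composite_front_to_back` and the per-pixel scalar opacities are assumptions made for this example.

```python
def composite_front_to_back(alphas):
    """Blend weights for one pixel covered by several instances.

    alphas: opacities in [0, 1], ordered from the front-most
            instance (index 0) to the back-most.
    Returns (weights, remaining_transmittance).
    """
    weights = []
    transmittance = 1.0  # fraction of the pixel not yet occluded
    for alpha in alphas:
        # An instance contributes only through what is still visible.
        weights.append(alpha * transmittance)
        # Everything behind it sees less of the pixel.
        transmittance *= 1.0 - alpha
    return weights, transmittance


# A fully opaque front instance hides everything behind it.
print(composite_front_to_back([1.0, 0.5, 1.0]))
# Two half-transparent instances share the pixel unevenly.
print(composite_front_to_back([0.5, 0.5]))
```

Under this scheme, swapping the order of two overlapping instances changes their weights, which is why an explicit Z-order matters in overlap regions.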

The paper also introduces SA-Z, a large-scale dataset with explicit occlusion order and amodal supervision for occlusion-aware layout generation.

🔧 Key Features

SA-Z Dataset Curation: Enriches layout annotations with instance captions, explicit occlusion order, and amodal signals.
Occlusion-Aware DiT Framework: Models Z-order dependencies explicitly rather than mixing overlapping instances implicitly.
Instance Decoupling + Volumetric Composition: Improves robustness on dense overlap scenes by composing instances with transmittance-based ordering.
Queried Alignment Mechanism: Improves spatial faithfulness and local semantic consistency.

💻 Quick Start

Environment setup

cd OcclusionFormer
conda create -n OcclusionFormer python=3.11 -y
conda activate OcclusionFormer

Install requirements

pip install --upgrade -r requirements.txt

Download checkpoint

https://huggingface.co/FudanCVL/OcclusionFormer
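
If you prefer to fetch the checkpoint from a script, the sketch below uses `snapshot_download` from the `huggingface_hub` package. The target directory `./checkpoints/OcclusionFormer` is a hypothetical choice, and whether the downloaded folder can be passed directly as `--ckpt_path` is an assumption to verify against the repository contents.

```python
REPO_ID = "FudanCVL/OcclusionFormer"  # taken from the URL above


def download_checkpoint(local_dir="./checkpoints/OcclusionFormer"):
    """Download the whole model repository and return its local path.

    local_dir is a hypothetical location chosen for this example.
    """
    # Imported lazily so this module loads even before
    # huggingface_hub is installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint()` once and reusing the returned path avoids re-downloading on later runs.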

Run Streamlit demo

streamlit run demo_occlusionformer.py

Run CLI inference

python inference_occlusionformer.py \
  --model_path /path/to/FLUX.1-dev \
  --ckpt_path /path/to/occlusionformer_checkpoint_dir \
  --layout_json ./examples/livingroom.json \
  --output_dir ./outputs_occlusionformer \
  --enable_layout \
  --overwrite

Batch inference with a directory of JSON layouts:

python inference_occlusionformer.py \
  --model_path /path/to/FLUX.1-dev \
  --ckpt_path /path/to/occlusionformer_checkpoint_dir \
  --layout_dir ./examples \
  --output_dir ./outputs_occlusionformer \
  --enable_layout \
  --overwrite
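
As an alternative to `--layout_dir`, you may want one CLI call per layout, for example so that a failure on one file does not stop the rest. The sketch below builds such commands using only the flags shown above; the helper names `build_command`, `commands_for_directory`, and `run_all` are hypothetical and not part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def build_command(layout_json, model_path, ckpt_path, output_dir):
    """Return the argument list for one single-layout inference run."""
    return [
        sys.executable, "inference_occlusionformer.py",
        "--model_path", str(model_path),
        "--ckpt_path", str(ckpt_path),
        "--layout_json", str(layout_json),
        "--output_dir", str(output_dir),
        "--enable_layout",
        "--overwrite",
    ]


def commands_for_directory(layout_dir, model_path, ckpt_path, output_dir):
    """Build one command per *.json file, in sorted order."""
    layouts = sorted(Path(layout_dir).glob("*.json"))
    return [build_command(p, model_path, ckpt_path, output_dir) for p in layouts]


def run_all(commands):
    """Run each command and return the ones that exited with an error."""
    failed = []
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd)
    return failed
```

A typical use would be `run_all(commands_for_directory("./examples", "/path/to/FLUX.1-dev", "/path/to/occlusionformer_checkpoint_dir", "./outputs_occlusionformer"))`, run from the repository root.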

✅ TODO

Organize and update the amodal annotations on Hugging Face.

📁 Repository Scope

This folder provides a standalone inference/demo package:

demo_occlusionformer.py: Streamlit demo UI
inference_occlusionformer.py: CLI inference
src/occlusionformer/: OcclusionFormer core modules
src/utils.py, src/transformer_utils.py: required utility modules
examples/: example layout JSON files
requirements.txt: runtime dependencies

⚙️ Inference Notes

The demo and CLI follow the project's current preprocessing logic and compose each prompt from the global prompt plus the instance captions.
Layout control is enabled via --enable_layout (or disabled with --disable_layout).
Outputs include generated images and layout overlays for visualization.
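
To picture the prompt composition mentioned in the first note, here is a rough sketch. The exact rule used by the demo and CLI is not documented here, so the function `compose_prompt`, the sentence-style joining, and the separator are all assumptions made for illustration.

```python
def compose_prompt(global_prompt, instance_captions, separator=". "):
    """Join a global prompt with per-instance captions (illustrative only)."""
    # Normalise whitespace and trailing periods so the joins stay clean.
    parts = [global_prompt.strip().rstrip(".")]
    parts += [c.strip().rstrip(".") for c in instance_captions if c.strip()]
    return separator.join(parts) + "."


print(compose_prompt("A cozy living room.", ["a grey sofa", "a wooden table "]))
```

Empty captions are skipped, so an instance without a caption adds nothing to the composed prompt in this sketch.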

👍 Acknowledgement

This work is built on many amazing research works and open-source projects. We thank the authors for sharing!

💗 Citation

@inproceedings{li2026occlusionformer,
  title={OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation},
  author={Li, Ziye and Ding, Henghui},
  booktitle={ICML},
  year={2026}
}
