OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation

Ziye Li, Henghui Ding✉

Fudan University

ICML 2026

✉ Corresponding Author

🔥 News

  • [2026/05/18] Release inference code, model weights and SA-Z dataset.
  • [2026/05/18] Release OcclusionFormer open-source package in this repository.
  • [2026/04/30] OcclusionFormer was accepted to ICML 2026.

😊 Introduction

OcclusionFormer addresses a core challenge in layout-to-image generation: when multiple bounding boxes overlap, standard methods often produce entangled textures and incorrect front/back ordering.

As described in the paper, OcclusionFormer introduces explicit Z-order modeling for layout-grounded generation by:

  • decoupling instance generation,
  • arranging occlusion order with a volume-rendering-inspired transmittance mechanism,
  • and enforcing spatial precision with a queried alignment objective.
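
The transmittance-based ordering named above can be illustrated with standard front-to-back compositing from volume rendering. The sketch below is an illustrative toy, not the paper's implementation: it blends scalar pixel values, whereas the model presumably operates on learned features, and the function name `composite_front_to_back` and the per-pixel opacity inputs are assumptions made for the example. Each instance i contributes with weight T_i * alpha_i, where T_i is the product of (1 - alpha_j) over all instances j in front of it.

```python
from typing import List, Sequence


def composite_front_to_back(
    colors: Sequence[Sequence[float]],
    alphas: Sequence[Sequence[float]],
) -> List[float]:
    """Blend per-instance pixel values in Z-order (hypothetical helper).

    colors[i][p] is the value of instance i at pixel p and alphas[i][p] is its
    opacity in [0, 1]. Index 0 is the front-most instance.
    """
    num_pixels = len(colors[0])
    out = [0.0] * num_pixels
    # Transmittance: fraction of each pixel not yet covered by nearer instances.
    transmittance = [1.0] * num_pixels
    for color, alpha in zip(colors, alphas):
        for p in range(num_pixels):
            weight = transmittance[p] * alpha[p]
            out[p] += weight * color[p]
            transmittance[p] *= 1.0 - alpha[p]
    return out


# Two instances over three pixels; they overlap at the middle pixel.
front_color, front_alpha = [1.0, 1.0, 1.0], [1.0, 1.0, 0.0]
back_color, back_alpha = [0.5, 0.5, 0.5], [0.0, 1.0, 1.0]
print(composite_front_to_back([front_color, back_color],
                              [front_alpha, back_alpha]))  # [1.0, 1.0, 0.5]
```

Because the front instance is fully opaque at the overlapping pixel, the back instance contributes nothing there, which is the behavior explicit Z-ordering is meant to guarantee.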

The paper also introduces SA-Z, a large-scale dataset with explicit occlusion order and amodal supervision for occlusion-aware layout generation.


🔧 Key Features

  • SA-Z Dataset Curation: Enriches layout annotations with instance captions, explicit occlusion order, and amodal signals.
  • Occlusion-Aware DiT Framework: Models Z-order dependencies explicitly rather than mixing overlapping instances implicitly.
  • Instance Decoupling + Volumetric Composition: Improves robustness on dense overlap scenes by composing instances with transmittance-based ordering.
  • Queried Alignment Mechanism: Improves spatial faithfulness and local semantic consistency.

💻 Quick Start

  1. Environment setup
cd OcclusionFormer
conda create -n OcclusionFormer python=3.11 -y
conda activate OcclusionFormer
  2. Install requirements
pip install --upgrade -r requirements.txt
  3. Download checkpoint
https://huggingface.co/FudanCVL/OcclusionFormer
  4. Run Streamlit demo
streamlit run demo_occlusionformer.py
  5. Run CLI inference
python inference_occlusionformer.py \
  --model_path /path/to/FLUX.1-dev \
  --ckpt_path /path/to/occlusionformer_checkpoint_dir \
  --layout_json ./examples/livingroom.json \
  --output_dir ./outputs_occlusionformer \
  --enable_layout \
  --overwrite

Batch inference with a directory of JSON layouts:

python inference_occlusionformer.py \
  --model_path /path/to/FLUX.1-dev \
  --ckpt_path /path/to/occlusionformer_checkpoint_dir \
  --layout_dir ./examples \
  --output_dir ./outputs_occlusionformer \
  --enable_layout \
  --overwrite
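
When the CLI needs to be driven from a script, the two invocations above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository. It uses only the flags shown in this README, and it assumes that `--layout_json` and `--layout_dir` are alternatives of which exactly one is passed.

```python
import sys
from typing import List, Optional


def build_inference_command(
    model_path: str,
    ckpt_path: str,
    output_dir: str,
    layout_json: Optional[str] = None,
    layout_dir: Optional[str] = None,
    enable_layout: bool = True,
    overwrite: bool = False,
) -> List[str]:
    """Build the argument list for inference_occlusionformer.py (hypothetical helper)."""
    # Assumption: a single layout file and a layout directory are mutually exclusive.
    if (layout_json is None) == (layout_dir is None):
        raise ValueError("pass exactly one of layout_json or layout_dir")

    cmd = [
        sys.executable, "inference_occlusionformer.py",
        "--model_path", model_path,
        "--ckpt_path", ckpt_path,
        "--output_dir", output_dir,
    ]
    if layout_json is not None:
        cmd += ["--layout_json", layout_json]
    else:
        cmd += ["--layout_dir", layout_dir]
    # The README lists --enable_layout and --disable_layout as the two switches.
    cmd.append("--enable_layout" if enable_layout else "--disable_layout")
    if overwrite:
        cmd.append("--overwrite")
    return cmd


# Mirror the batch example: every JSON layout under ./examples.
cmd = build_inference_command(
    model_path="/path/to/FLUX.1-dev",
    ckpt_path="/path/to/occlusionformer_checkpoint_dir",
    output_dir="./outputs_occlusionformer",
    layout_dir="./examples",
    overwrite=True,
)
print(" ".join(cmd[1:]))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` from the repository root would launch the run.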

✅ TODO

  • Organize and update the amodal annotations on Hugging Face.

📁 Repository Scope

This folder provides a standalone inference/demo package:

  • demo_occlusionformer.py: Streamlit demo UI
  • inference_occlusionformer.py: CLI inference
  • src/occlusionformer/: OcclusionFormer core modules
  • src/utils.py, src/transformer_utils.py: required utility modules
  • examples/: example layout JSON files
  • requirements.txt: runtime dependencies

⚙️ Inference Notes

  • The demo and CLI follow the project's current preprocessing logic and compose each prompt from the global prompt plus the instance captions.
  • Layout control is enabled via --enable_layout (or disabled with --disable_layout).
  • Outputs include generated images and layout overlays for visualization.

👏 Acknowledgement

This work builds on many excellent research works and open-source projects. We thank their authors for sharing them!


💗 Citation

@inproceedings{li2026occlusionformer,
  title={OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation},
  author={Li, Ziye and Ding, Henghui},
  booktitle={ICML},
  year={2026}
}