# WinDiNet: Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows
WinDiNet repurposes a 2-billion-parameter video diffusion transformer (LTX-Video) as a fast, differentiable surrogate for computational fluid dynamics (CFD) simulations of urban wind patterns. Fine-tuned on 10,000 CFD simulations across procedurally generated building layouts, it generates complete 112-frame wind field rollouts in under one second — over 2,000x faster than the ground truth CFD solver.
- Physics-informed VAE decoder: Fine-tuned with incompressibility and wall boundary losses for physically consistent velocity field reconstruction
- Scalar conditioning: Fourier-feature-encoded inlet speed and domain size replace text prompts, enabling precise physical parametrisation
- Differentiable end-to-end: Enables gradient-based inverse design of urban building layouts for pedestrian wind comfort
- State-of-the-art accuracy: Outperforms specialised neural operators (FNO, OFormer, Poseidon, U-Net) on vRMSE, spectral divergence, and Wasserstein distance
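The incompressibility loss mentioned above can be approximated as a penalty on the divergence of the decoded 2D velocity field. A minimal NumPy sketch with central finite differences — the actual discretisation and weighting used in training are not specified here and this is illustrative only:

```python
import numpy as np

def incompressibility_loss(u, v, dx=1.0):
    """Mean squared divergence of a 2D velocity field.

    u, v : 2D arrays of x- and y-velocity on a regular grid.
    An incompressible (divergence-free) field gives a loss of ~0.
    Central differences on the interior; uniform grid spacing dx assumed.
    """
    du_dx = (u[1:-1, 2:] - u[1:-1, :-2]) / (2 * dx)
    dv_dy = (v[2:, 1:-1] - v[:-2, 1:-1]) / (2 * dx)
    div = du_dx + dv_dy
    return float(np.mean(div ** 2))

# A uniform flow is trivially divergence-free:
u = np.full((8, 8), 10.0)   # constant 10 m/s in x
v = np.zeros((8, 8))
print(incompressibility_loss(u, v))  # 0.0
```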
## Model Weights
This repository contains three checkpoint files:
| File | Description | Parameters | Size |
|---|---|---|---|
| `dit.safetensors` | Fine-tuned diffusion transformer | 1.92B | 7.7 GB |
| `scalar_embedding.safetensors` | Fourier feature scalar conditioning module | 4.3M | 17 MB |
| `vae_decoder.safetensors` | Physics-informed VAE decoder | 553M | 2.2 GB |
### Download
Checkpoints are downloaded automatically when using the `windinet` package. For manual download:
```bash
# Using Hugging Face CLI
huggingface-cli download rabischof/windinet --local-dir checkpoints/

# Or individual files
huggingface-cli download rabischof/windinet dit.safetensors
huggingface-cli download rabischof/windinet scalar_embedding.safetensors
huggingface-cli download rabischof/windinet vae_decoder.safetensors
```
## Installation
```bash
git clone https://github.com/rbischof/windinet.git
cd windinet
pip install -e .
```
## Inference
Each input sample is a building footprint PNG (black=building, white=fluid) paired with a JSON file specifying inlet conditions:
```json
{"inlet_speed_mps": 10.0, "field_size_m": 1400}
```
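A matching PNG/JSON input pair can be created like this (a minimal sketch using Pillow; the `sample.*` file naming is an assumption — check the examples shipped in `examples/footprints/` for the expected convention):

```python
import json
import numpy as np
from PIL import Image

# 256x256 footprint: white (255) = fluid, black (0) = building
footprint = np.full((256, 256), 255, dtype=np.uint8)
footprint[100:156, 100:130] = 0  # one rectangular building

Image.fromarray(footprint).save("sample.png")
with open("sample.json", "w") as f:
    json.dump({"inlet_speed_mps": 10.0, "field_size_m": 1400}, f)
```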
Run inference:
```bash
python scripts/inference.py configs/inference.yaml \
    --input_dir examples/footprints/ \
    --out_dir predictions/
```
Outputs per sample: a `.npz` file containing the u/v velocity fields (m/s, float16) and an `.mp4` video of the wind-speed magnitude.
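The `.npz` outputs can then be post-processed, e.g. to compute the wind-speed magnitude per frame. A sketch with a stand-in file; the array key names `u` and `v` are an assumption — inspect a real output with `np.load(...).files`:

```python
import numpy as np

# Stand-in for a WinDiNet output: 112 frames of 256x256 u/v fields in float16
np.savez("sample.npz",
         u=np.full((112, 256, 256), 3.0, dtype=np.float16),
         v=np.full((112, 256, 256), 4.0, dtype=np.float16))

data = np.load("sample.npz")
u = data["u"].astype(np.float32)  # upcast from float16 before arithmetic
v = data["v"].astype(np.float32)
speed = np.sqrt(u**2 + v**2)      # wind magnitude in m/s, per frame
print(speed.shape)                # (112, 256, 256)
```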
## Training
WinDiNet training has two stages:
Stage 1: VAE decoder fine-tuning with physics-informed losses (incompressibility + wall boundary conditions):
```bash
python scripts/finetune_vae.py configs/finetune_vae.yaml
```
Stage 2: Diffusion transformer training with scalar conditioning:
```bash
python scripts/train.py configs/windinet_scalar.yaml
```
See the GitHub repository for dataset preparation and configuration details.
## Inverse Design
WinDiNet serves as a differentiable surrogate for gradient-based optimisation of building layouts:
```bash
python scripts/inverse_design.py configs/inverse_opt.yaml
```
The optimiser adjusts building positions to minimise a Pedestrian Wind Comfort (PWC) loss. The framework is extensible to custom objectives and building parametrisations; see `inverse/objective.py` and `inverse/footprint.py`.
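Pedestrian wind comfort criteria are typically based on how often the pedestrian-level wind speed exceeds a comfort threshold (as in Lawson-style criteria). A minimal sketch of such an objective, using a sigmoid instead of a hard step so it stays smooth for gradient-based optimisation — the threshold and sharpness values are illustrative, and the actual loss used by the scripts lives in `inverse/objective.py`:

```python
import numpy as np

def pwc_loss(speed, threshold=5.0, sharpness=2.0):
    """Soft fraction of (frame, pixel) samples exceeding a comfort threshold.

    speed : array of wind magnitudes (frames, H, W) in m/s.
    A sigmoid replaces the hard exceedance indicator so the objective
    remains smooth, which gradient-based layout optimisation requires.
    """
    exceed = 1.0 / (1.0 + np.exp(-sharpness * (speed - threshold)))
    return float(np.mean(exceed))

calm = np.full((112, 64, 64), 2.0)   # well below threshold -> loss near 0
windy = np.full((112, 64, 64), 9.0)  # well above threshold -> loss near 1
print(pwc_loss(calm), pwc_loss(windy))
```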
## Citation
```bibtex
@article{perini2025windinet,
  title={Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows},
  author={Perini, Janne and Bischof, Rafael and Arar, Moab and Duran, Ay{\c{c}}a and Kraus, Michael A. and Mishra, Siddhartha and Bickel, Bernd},
  journal={arXiv preprint arXiv:2603.21210},
  year={2026}
}
```
## Details
- License: Apache 2.0
- Base model: LTX-Video 2B v0.9.6
- Training data: 10,000 CFD simulations (256x256, 112 frames each)
- arXiv: 2603.21210
- Authors: Janne Perini*, Rafael Bischof*, Moab Arar, Ayca Duran, Michael A. Kraus, Siddhartha Mishra, Bernd Bickel (* equal contribution)