PixNerd-XL-16 Diffusers Checkpoints

Production-ready Diffusers export of PixNerd-XL/16 class-conditional ImageNet checkpoints.

Available Checkpoints

  • PixNerd-XL-16-256
    • source: epoch=319-step=1600000_emainit.ckpt
    • target resolution: 256x256
  • PixNerd-XL-16-512
    • source: res512_ft200k_epoch=325-step=1800000_emainit.ckpt
    • target resolution: 512x512

Both checkpoints are packaged with:

  • pipeline.py
  • modeling_pixnerd_transformer_2d.py
  • scheduling_pixnerd_flow_match.py
  • transformer/ weights + config
  • scheduler/ config
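
A quick way to confirm that a downloaded checkpoint directory is complete is to check for the entries listed above. This is a minimal sketch: the helper name missing_entries is hypothetical, and it only checks that each path exists, not that the weights or configs are valid.

```python
from pathlib import Path

# Entries listed above as shipped with each checkpoint.
EXPECTED_ENTRIES = [
    "pipeline.py",
    "modeling_pixnerd_transformer_2d.py",
    "scheduling_pixnerd_flow_match.py",
    "transformer",
    "scheduler",
]


def missing_entries(model_dir):
    """Return the expected files or folders that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries("PixNerd-XL-16-256")
    print("missing:", missing if missing else "none")
```

An empty result means every listed entry is present in the directory.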

Requirements

pip install torch diffusers

Inference (Python)

import torch
from diffusers import DiffusionPipeline

model_dir = "PixNerd-XL-16-256"  # or PixNerd-XL-16-512
pipe = DiffusionPipeline.from_pretrained(
    model_dir,
    custom_pipeline=f"{model_dir}/pipeline.py",
    torch_dtype=torch.float32,
).to("cpu")  # use "cuda" if available

# Class-conditional generation: class label 207 (golden retriever)
images = pipe(
    prompt=[207],
    num_images_per_prompt=1,
    height=256,
    width=256,
    num_inference_steps=25,
    guidance_scale=4.0,
    timeshift=3.0,
    order=2,
).images

images[0].save("sample.png")
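
The two checkpoints differ in target resolution, so the call arguments can be derived from the directory name. The sketch below is one way to do that; call_kwargs_for is a hypothetical helper, and the sampling values are simply the ones from the example above, not tuned recommendations.

```python
# Target resolution per checkpoint, as listed under "Available Checkpoints".
TARGET_RESOLUTION = {
    "PixNerd-XL-16-256": 256,
    "PixNerd-XL-16-512": 512,
}


def call_kwargs_for(model_dir, class_label):
    """Build pipeline call arguments for one class label (hypothetical helper)."""
    name = model_dir.rstrip("/").split("/")[-1]
    size = TARGET_RESOLUTION[name]  # KeyError for an unknown checkpoint name
    return {
        "prompt": [class_label],
        "num_images_per_prompt": 1,
        "height": size,
        "width": size,
        # Sampling values copied from the example above.
        "num_inference_steps": 25,
        "guidance_scale": 4.0,
        "timeshift": 3.0,
        "order": 2,
    }


print(call_kwargs_for("PixNerd-XL-16-512", 207)["height"])  # 512
```

With a loaded pipeline, the result can be unpacked into the call: images = pipe(**call_kwargs_for(model_dir, 207)).images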

Interface Notes

  • The pipeline takes its conditioning input through the prompt argument.
  • For class-conditional generation, pass a list of integer class labels, e.g. prompt=[207].
  • height and width should match the checkpoint's target resolution (256 or 512); other sizes work if they are divisible by the patch size.
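
The constraints above can be checked before calling the pipeline. This is a sketch under two assumptions that are not stated in this card: that the patch size is 16 (inferred from the "/16" in the model name) and that labels follow the usual ImageNet-1k range 0 to 999. check_inputs is a hypothetical helper, not part of the shipped pipeline.

```python
PATCH_SIZE = 16     # assumption: inferred from the "/16" in PixNerd-XL/16
NUM_CLASSES = 1000  # assumption: standard ImageNet-1k label range 0..999


def check_inputs(labels, height, width):
    """Raise ValueError if labels or image size look unusable (hypothetical helper)."""
    for label in labels:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(label, bool) or not isinstance(label, int):
            raise ValueError(f"label must be an integer, got {label!r}")
        if not 0 <= label < NUM_CLASSES:
            raise ValueError(f"label {label} is outside 0..{NUM_CLASSES - 1}")
    for name, value in (("height", height), ("width", width)):
        if value <= 0 or value % PATCH_SIZE != 0:
            raise ValueError(f"{name}={value} is not a positive multiple of {PATCH_SIZE}")


check_inputs([207], 256, 256)  # passes silently
```

Failing early this way gives a readable error instead of a shape mismatch deep inside the transformer.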

Reproducibility Metadata

  • Architecture and conversion provenance are recorded in each checkpoint's conversion_metadata.json.
  • Transformer and scheduler runtime classes are defined in repository-local Python modules shipped with each checkpoint.
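
To inspect the recorded provenance, the metadata file can be read with the standard library. The sketch assumes conversion_metadata.json sits at the top level of the checkpoint directory and holds a JSON object; its keys are not documented here, so the code just lists whatever it finds. load_conversion_metadata is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_conversion_metadata(model_dir):
    """Load conversion_metadata.json from a checkpoint directory, or return None if absent."""
    path = Path(model_dir) / "conversion_metadata.json"  # assumed location
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


metadata = load_conversion_metadata("PixNerd-XL-16-256")
if metadata is None:
    print("conversion_metadata.json not found")
else:
    # Assumes a JSON object at the top level; print its keys in sorted order.
    for key in sorted(metadata):
        print(f"{key}: {metadata[key]}")
```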

Limitations

  • Intended for ImageNet class-conditional generation.
  • No text encoder is included.
  • Output quality depends on the sampling settings (such as guidance_scale, timeshift, and order) and on num_inference_steps.

Citation

Source paper (ICLR 2026):

Source code:

@article{2507.23268,
  Author = {Shuai Wang and Ziteng Gao and Chenhui Zhu and Weilin Huang and Limin Wang},
  Title = {PixNerd: Pixel Neural Field Diffusion},
  Year = {2025},
  Eprint = {arXiv:2507.23268},
}