---
license: mit
library_name: diffusers
tags:
- diffusers
- image-generation
- class-conditional
- imagenet
- pixnerd
language:
- en
---
# PixNerd-XL-16 Diffusers Checkpoints

Production-ready Diffusers export of PixNerd-XL/16 class-conditional ImageNet checkpoints.
## Available Checkpoints

- `PixNerd-XL-16-256`
  - source: `epoch%3D319-step%3D1600000_emainit.ckpt`
  - target resolution: `256x256`
- `PixNerd-XL-16-512`
  - source: `res512_ft200k_epoch%3D325-step%3D1800000_emainit.ckpt`
  - target resolution: `512x512`

Both checkpoints are packaged with:

- `pipeline.py`
- `modeling_pixnerd_transformer_2d.py`
- `scheduling_pixnerd_flow_match.py`
- `transformer/` weights + config
- `scheduler/` config
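Before loading, it can help to confirm that a local checkpoint directory contains the entries listed above. The following is a minimal sketch using only the standard library; `missing_entries` is a hypothetical helper, not part of the repository, and it only checks that each entry exists, not that its contents are valid.

```python
from pathlib import Path

# File and directory names taken from the packaging list in this card.
EXPECTED_ENTRIES = (
    "pipeline.py",
    "modeling_pixnerd_transformer_2d.py",
    "scheduling_pixnerd_flow_match.py",
    "transformer",
    "scheduler",
)


def missing_entries(checkpoint_dir):
    """Return the expected entries that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    for name in ("PixNerd-XL-16-256", "PixNerd-XL-16-512"):
        missing = missing_entries(name)
        print(name, "OK" if not missing else f"missing: {missing}")
```

An empty list means every listed entry was found in that directory.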
## Requirements

```bash
pip install torch diffusers
```
## Inference (Python)

```python
import torch
from diffusers import DiffusionPipeline

model_dir = "PixNerd-XL-16-256"  # or "PixNerd-XL-16-512"
pipe = DiffusionPipeline.from_pretrained(
    model_dir,
    custom_pipeline=f"{model_dir}/pipeline.py",
    torch_dtype=torch.float32,
).to("cpu")  # use "cuda" if available

# Class-conditional generation: class label 207 (golden retriever)
images = pipe(
    prompt=[207],
    num_images_per_prompt=1,
    height=256,
    width=256,
    num_inference_steps=25,
    guidance_scale=4.0,
    timeshift=3.0,
    order=2,
).images
images[0].save("sample.png")
```
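To keep `height` and `width` in step with the chosen checkpoint, the call arguments can be derived from the checkpoint name. This is a minimal sketch: `NATIVE_RESOLUTION` and `build_call_kwargs` are hypothetical helpers, not part of the repository. It assumes that labels follow the ImageNet-1k range 0..999 and that several labels may be passed in one `prompt` list; neither is confirmed by this card.

```python
from pathlib import Path

# Target resolutions as listed for the two checkpoints in this card.
NATIVE_RESOLUTION = {"PixNerd-XL-16-256": 256, "PixNerd-XL-16-512": 512}

# Assumption: labels follow the ImageNet-1k convention (0..999).
NUM_CLASSES = 1000


def build_call_kwargs(model_dir, class_labels, num_inference_steps=25):
    """Hypothetical helper: assemble pipeline call arguments for one checkpoint."""
    size = NATIVE_RESOLUTION[Path(model_dir).name]
    labels = [int(label) for label in class_labels]
    out_of_range = [label for label in labels if not 0 <= label < NUM_CLASSES]
    if out_of_range:
        raise ValueError(f"class labels outside 0..{NUM_CLASSES - 1}: {out_of_range}")
    return {
        "prompt": labels,
        "num_images_per_prompt": 1,
        "height": size,
        "width": size,
        "num_inference_steps": num_inference_steps,
        # Remaining values are copied from the inference example in this card.
        "guidance_scale": 4.0,
        "timeshift": 3.0,
        "order": 2,
    }


kwargs = build_call_kwargs("PixNerd-XL-16-512", [207])
print(kwargs["height"], kwargs["width"])  # 512 512
```

With a pipeline loaded from the matching directory, the result would be used as `pipe(**kwargs).images`.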
## Interface Notes

- The pipeline takes its conditioning input through the `prompt` argument.
- For class-conditional generation, pass integer class labels, e.g. `prompt=[207]`.
- `height` and `width` should match the checkpoint's target resolution (256 or 512); other sizes work if they are divisible by the patch size.
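The divisibility rule for custom sizes can be checked before calling the pipeline. A minimal sketch follows, assuming a patch size of 16 (inferred from the "/16" in the model name, not stated explicitly in this card); `patch_grid` is a hypothetical helper.

```python
PATCH_SIZE = 16  # assumption: inferred from the "/16" in PixNerd-XL/16


def patch_grid(height, width, patch_size=PATCH_SIZE):
    """Return (rows, cols) of patches, or raise if a side is not a positive multiple."""
    for name, value in (("height", height), ("width", width)):
        if value <= 0 or value % patch_size != 0:
            raise ValueError(f"{name}={value} is not a positive multiple of {patch_size}")
    return height // patch_size, width // patch_size


print(patch_grid(256, 256))  # (16, 16)
print(patch_grid(512, 384))  # (32, 24)
```

If the actual patch size differs, pass it through the `patch_size` argument.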
## Reproducibility Metadata

- Architecture and conversion provenance are recorded in each checkpoint's `conversion_metadata.json`.
- Transformer and scheduler runtime classes are defined in repository-local Python modules shipped with each checkpoint.
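The provenance file can be inspected with the standard library. This sketch assumes that `conversion_metadata.json` sits at the root of the checkpoint directory and holds a JSON object at the top level. The key names are not documented in this card, so the code lists whatever keys it finds rather than expecting specific ones.

```python
import json
from pathlib import Path


def load_conversion_metadata(checkpoint_dir):
    """Read conversion_metadata.json from a checkpoint directory (assumed location)."""
    path = Path(checkpoint_dir) / "conversion_metadata.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(metadata):
    """Format each top-level key and value as one line, sorted by key."""
    return [f"{key}: {metadata[key]}" for key in sorted(metadata)]


if __name__ == "__main__":
    checkpoint = Path("PixNerd-XL-16-256")
    if (checkpoint / "conversion_metadata.json").exists():
        print("\n".join(describe(load_conversion_metadata(checkpoint))))
```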
## Limitations

- Intended for ImageNet class-conditional generation.
- No text encoder is included.
- Output quality depends on scheduler settings and inference step count.
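Because quality varies with the scheduler settings and step count, a small sweep can help pick values for a given class. The sketch below is illustrative only: `sweep_settings` and `run_sweep` are hypothetical helpers, and the grid values are arbitrary examples, not recommendations from the authors. `run_sweep` expects a pipeline object that accepts the same arguments as the inference example in this card.

```python
from itertools import product


def sweep_settings(steps=(10, 25, 50), guidance_scales=(2.0, 4.0), timeshift=3.0, order=2):
    """Build one keyword-argument dict per (steps, guidance_scale) combination."""
    return [
        {
            "num_inference_steps": step_count,
            "guidance_scale": scale,
            "timeshift": timeshift,
            "order": order,
        }
        for step_count, scale in product(steps, guidance_scales)
    ]


def run_sweep(pipe, class_label, size, settings):
    """Generate one image per settings dict and return (settings, image) pairs."""
    results = []
    for kwargs in settings:
        image = pipe(prompt=[class_label], height=size, width=size, **kwargs).images[0]
        results.append((kwargs, image))
    return results


print(len(sweep_settings()))  # 6
```

Each returned image can then be saved under a name that encodes its settings, so the outputs are easy to compare side by side.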
## Citation

Source paper (ICLR 2026):

- [PixNerd: Pixel Neural Field Diffusion](http://arxiv.org/abs/2507.23268)
- [Hugging Face Papers page](https://huggingface.co/papers/2507.23268)

Source code:

- Original PixNerd codebase: [MCG-NJU/PixNerd](https://github.com/MCG-NJU/PixNerd)
- Diffusers conversion code used for this export: [Bili-Sakura/PixNerd-diffusers](https://github.com/Bili-Sakura/PixNerd-diffusers)

```bibtex
@article{2507.23268,
  Author = {Shuai Wang and Ziteng Gao and Chenhui Zhu and Weilin Huang and Limin Wang},
  Title = {PixNerd: Pixel Neural Field Diffusion},
  Year = {2025},
  Eprint = {arXiv:2507.23268},
}
```