PixCell: A generative foundation model for digital histopathology images

[📄 arXiv] [GitHub] [🔬 PixCell-1024] [🔬 PixCell-256] [🔬 Pixcell-256-Cell-ControlNet] [💾 Synthetic-TCGA-10M]

Load PixCell-1024 model

import torch

from diffusers import DiffusionPipeline
from diffusers import AutoencoderKL

device = torch.device('cuda')

# We do not host the weights of the SD3 VAE -- load it from StabilityAI
sd3_vae = AutoencoderKL.from_pretrained("stabilityai/stable-diffusion-3.5-large", subfolder="vae")

pipeline = DiffusionPipeline.from_pretrained(
    "StonyBrook-CVLab/PixCell-1024",
    vae=sd3_vae,
    custom_pipeline="StonyBrook-CVLab/PixCell-pipeline",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)

pipeline.to(device);

Load [UNI-2h] for conditioning

import timm
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform

timm_kwargs = {
    'img_size': 224,
    'patch_size': 14,
    'depth': 24,
    'num_heads': 24,
    'init_values': 1e-5,
    'embed_dim': 1536,
    'mlp_ratio': 2.66667*2,
    'num_classes': 0,
    'no_embed_class': True,
    'mlp_layer': timm.layers.SwiGLUPacked,
    'act_layer': torch.nn.SiLU,
    'reg_tokens': 8,
    'dynamic_img_size': True,
}
uni_model = timm.create_model("hf-hub:MahmoodLab/UNI2-h", pretrained=True, **timm_kwargs)
transform = create_transform(**resolve_data_config(uni_model.pretrained_cfg, model=uni_model))
uni_model.eval()
uni_model.to(device);

Unconditional generation

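# Use the pipeline's unconditional embedding (batch size 1) as the only condition
uncond = pipeline.get_unconditional_embedding(1)
# No negative embedding is passed; guidance_scale is left at 1.0
with torch.amp.autocast('cuda'):
    samples = pipeline(uni_embeds=uncond, negative_uni_embeds=None, guidance_scale=1.0).images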
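
The call above uses guidance_scale=1.0 and no negative embedding. As a rough illustration of why that pairing is reasonable, here is a minimal sketch of the standard classifier-free guidance combination. It assumes the pipeline follows the usual formula; the PixCell pipeline code is not shown here, so treat this as an assumption. `cfg_combine` is a hypothetical helper, and the numbers are toy values standing in for model predictions.

```python
def cfg_combine(cond_pred, uncond_pred, guidance_scale):
    """Standard classifier-free guidance: move from the unconditional
    prediction toward the conditional one, scaled by guidance_scale.
    (Hypothetical helper for illustration, not part of PixCell.)"""
    return [u + guidance_scale * (c - u) for c, u in zip(cond_pred, uncond_pred)]


# Toy predictions standing in for the denoiser outputs
cond = [1.0, -2.0]
uncond = [0.5, 0.0]

no_guidance = cfg_combine(cond, uncond, 1.0)  # equals the conditional prediction
guided = cfg_combine(cond, uncond, 1.5)       # pushed past the conditional prediction
print(no_guidance, guided)
```

Under this formula a scale of 1.0 returns the conditional prediction unchanged, so the unconditional branch contributes nothing. Scales above 1.0 extrapolate away from the unconditional prediction, which is when a negative embedding matters.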

Conditional generation

# Load image
import numpy as np
import einops
from PIL import Image
from huggingface_hub import hf_hub_download

# This is an example image we provide
path = hf_hub_download(repo_id="StonyBrook-CVLab/PixCell-1024", filename="test_image.png")
image = Image.open(path).convert("RGB")


# Split the 1024x1024 image into a 4x4 grid of 16 patches, each 256x256
uni_patches = np.array(image)
uni_patches = einops.rearrange(uni_patches, '(d1 h) (d2 w) c -> (d1 d2) h w c', d1=4, d2=4)
uni_input = torch.stack([transform(Image.fromarray(item)) for item in uni_patches])

# Extract UNI embeddings
with torch.inference_mode():
    uni_emb = uni_model(uni_input.to(device))

# Add a batch dimension: (16, D) -> (bs, 16, D) with bs = 1
uni_emb = uni_emb.unsqueeze(0)
print("Extracted UNI:", uni_emb.shape)

# Get unconditional embedding for classifier-free guidance
uncond = pipeline.get_unconditional_embedding(uni_emb.shape[0])
# Generate new samples
with torch.amp.autocast('cuda'):
    samples = pipeline(uni_embeds=uni_emb, negative_uni_embeds=uncond, guidance_scale=1.5).images
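
The einops pattern used to split the image can be hard to read at a glance. As a sketch, the same 4x4 row-major tiling can be written with plain NumPy reshape and transpose. `split_into_patches` is a hypothetical helper (not part of PixCell or einops), and the synthetic array below stands in for the example image.

```python
import numpy as np


def split_into_patches(image, grid=4):
    """Split an (H, W, C) array into grid*grid tiles in row-major order.

    Mirrors the einops pattern '(d1 h) (d2 w) c -> (d1 d2) h w c'
    with d1 = d2 = grid. Hypothetical helper for illustration only.
    """
    h, w, c = image.shape
    assert h % grid == 0 and w % grid == 0, "image must divide evenly into the grid"
    ph, pw = h // grid, w // grid
    # (d1, ph, d2, pw, c) -> (d1, d2, ph, pw, c) -> (d1 * d2, ph, pw, c)
    tiles = image.reshape(grid, ph, grid, pw, c).transpose(0, 2, 1, 3, 4)
    return tiles.reshape(grid * grid, ph, pw, c)


# Synthetic 1024x1024 RGB image with one white tile at grid row 1, column 3
image = np.zeros((1024, 1024, 3), dtype=np.uint8)
image[256:512, 768:1024] = 255

patches = split_into_patches(image)
marked = [i for i, patch in enumerate(patches) if patch.all()]
print(patches.shape, marked)
```

The white tile lands at index 1 * 4 + 3 = 7, so under this ordering the 16 UNI embeddings follow the image tiles row by row, left to right.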

License & Usage

License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Notice: This model is a derivative work conditioned on embeddings from the [UNI-2h] foundation model. As such, it is subject to the original terms of the UNI2 license.

  • Academic & Research Use Only: You may use these weights for non-commercial research purposes.
  • No Commercial Use: You may not use this model for any commercial purpose, including product development or commercial services.