StyleID: Stylization-Agnostic Identity Encoder


StyleID is a CLIP-based image encoder trained to produce identity embeddings that are robust to stylization.
It can be used for identity similarity, retrieval, evaluation, and conditioning in generative models.


Installation

pip install torch transformers pillow

Usage

The encoder expects a single, clearly visible face per image; do not use it on images with multiple faces or faces too small to recognize.

import torch
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

model = CLIPModel.from_pretrained("kwanY/styleid").to(device)
processor = CLIPProcessor.from_pretrained("kwanY/styleid")

img_path = "face.jpg"  # path to a single-face image
img = Image.open(img_path).convert("RGB")
inputs = processor(images=img, return_tensors="pt").to(device)

with torch.no_grad():
    emb = model.get_image_features(**inputs)
    emb = emb / emb.norm(dim=-1, keepdim=True)  # L2-normalize (optional but recommended)
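Because the embeddings are L2-normalized, identity similarity reduces to a dot product and gallery retrieval to a matrix multiply. A minimal sketch on dummy vectors (in real use, `query` and the `gallery` rows would be StyleID embeddings from `get_image_features`; the helper names here are illustrative, not part of the model API):

```python
import torch
import torch.nn.functional as F

def identity_similarity(emb_a: torch.Tensor, emb_b: torch.Tensor) -> float:
    """Cosine similarity between two identity embeddings."""
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    return float((a * b).sum(dim=-1))

def retrieve(query: torch.Tensor, gallery: torch.Tensor, k: int = 1) -> torch.Tensor:
    """Indices of the k gallery embeddings most similar to the query."""
    q = F.normalize(query, dim=-1)    # shape (1, d)
    g = F.normalize(gallery, dim=-1)  # shape (n, d)
    scores = (g @ q.T).squeeze(-1)    # cosine scores, shape (n,)
    return scores.topk(k).indices

# Demo on dummy vectors; in practice these come from get_image_features.
q = torch.tensor([[3.0, 4.0]])
gallery = torch.tensor([[0.0, 1.0], [3.0, 4.0], [-1.0, 0.0]])
print(identity_similarity(q, q))  # 1.0 (same direction)
print(retrieve(q, gallery, k=1))  # tensor([1])
```

A similarity threshold for "same identity" depends on the downstream task and should be calibrated on a held-out set of stylized/unstylized pairs.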

Open for non-commercial research only. Do not use FFHQ, or models trained on it, for biometric human recognition.

Model size: 0.4B params (F32, Safetensors)
