Text-to-Image
Sana
Diffusers
Safetensors
English
Chinese
1024px_based_image_size
Multi-language
How to use Efficient-Large-Model/Sana_1600M_1024px_diffusers with Sana:

```python
# Load the model and infer image from text
import torch
from app.sana_pipeline import SanaPipeline
from torchvision.utils import save_image

sana = SanaPipeline("configs/sana_config/1024ms/Sana_1600M_img1024.yaml")
sana.from_pretrained("hf://Efficient-Large-Model/Sana_1600M_1024px_diffusers")

image = sana(
    prompt='a cyberpunk cat with a neon sign that says "Sana"',
    height=1024,
    width=1024,
    guidance_scale=5.0,
    pag_guidance_scale=2.0,
    num_inference_steps=18,
)
```
This model doesn't work correctly with the default pipeline
#3
by frutiemax - opened
I've fine-tuned a model based on this checkpoint, and I've noticed the images get really noisy when I introduce elements that were not part of my dataset (but were part of the base checkpoint). I then tried generating images with this checkpoint itself and found that it doesn't work with the default pipeline, i.e. the images are screwed up.
I then tried Efficient-Large-Model/Sana_1600M_1024px_MultiLing_diffusers, and this one works perfectly fine! Either update this model to work with the default pipeline or remove it.
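For anyone who wants to reproduce the comparison described above, here is a minimal sketch. It assumes that "the default pipeline" means `SanaPipeline` from the diffusers library, and that a CUDA GPU is available. The `output_name`, `generate`, and `compare` helpers, the bfloat16 dtype, the step count, and the seed are illustrative choices, not settings taken from this thread.

```python
# Sketch: render the same prompt with the same seed from both checkpoints so
# the results can be compared side by side. torch and diffusers are imported
# lazily inside generate(), so this module loads without them installed.

REPOS = [
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    "Efficient-Large-Model/Sana_1600M_1024px_MultiLing_diffusers",
]


def output_name(repo_id: str) -> str:
    """Turn a Hub repo id into a flat PNG file name (hypothetical helper)."""
    return repo_id.replace("/", "__") + ".png"


def generate(repo_id, prompt, seed=0, steps=20, guidance_scale=5.0):
    """Generate one 1024x1024 image with the diffusers SanaPipeline."""
    import torch
    from diffusers import SanaPipeline

    # Assumption: bfloat16 weights on a CUDA device; adjust for your hardware.
    pipe = SanaPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    # A fixed seed keeps the two checkpoints comparable.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    result = pipe(
        prompt=prompt,
        height=1024,
        width=1024,
        guidance_scale=guidance_scale,
        num_inference_steps=steps,
        generator=generator,
    )
    return result.images[0]


def compare(prompt, seed=0):
    """Save one image per checkpoint and return the written file names."""
    paths = []
    for repo_id in REPOS:
        image = generate(repo_id, prompt, seed=seed)
        path = output_name(repo_id)
        image.save(path)
        paths.append(path)
    return paths
```

Calling `compare('a cyberpunk cat with a neon sign that says "Sana"')` would write one PNG per checkpoint into the working directory. Loading both pipelines in sequence needs enough disk space and GPU memory for each model.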
Thanks man. The problem is caused by the diffusers code. I've opened a PR to fix it:
https://github.com/huggingface/diffusers/pull/10431