---
license: apache-2.0
tags:
- anime
- diffusion
- text-to-image
- image-generation
library_name: diffusers
pipeline_tag: text-to-image
language:
- en
---

# Aniimage-1

Aniimage-1 is the first latent diffusion model developed by 8BitStudio.
It generates 256×256 anime images and was trained from scratch using a UNet + VAE + CLIP architecture.
Aniimage-1 was trained on 830,001 anime images from [Danbooru](https://danbooru.donmai.us/). It is not based on any existing model; the UNet was trained from scratch.

## Model Details

| Property | Value |
|---|---|
| **Resolution** | 256×256 |
| **Architecture** | Latent Diffusion (UNet + VAE + CLIP) |
| **Parameters** | ~400M |
| **Training Steps** | 88,000 |
| **Batch Size** | 64 |
| **Dataset** | ~830K curated anime images from Danbooru |
| **GPU** | NVIDIA RTX 5060 Ti 16GB |
| **Scheduler** | DDIM or DPM++ 2M |

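
To put the training figures in perspective, the short calculation below estimates how many passes over the dataset the run amounts to. It is a back-of-the-envelope sketch that assumes every training step consumed exactly one batch of 64 images with no gradient accumulation, which the card does not state.

```python
# Figures taken from the Model Details table and the introduction.
TRAINING_STEPS = 88_000
BATCH_SIZE = 64
DATASET_SIZE = 830_001

# ASSUMPTION: one optimizer step == one batch of BATCH_SIZE images.
samples_seen = TRAINING_STEPS * BATCH_SIZE
approx_epochs = samples_seen / DATASET_SIZE

print(f"{samples_seen:,} samples seen, about {approx_epochs:.2f} passes over the dataset")
```

Under that assumption the model saw roughly 5.6 million samples, or a little under seven passes over the dataset.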
## Requirements

- **GPU**: ~3.4 GB VRAM minimum (4+ GB recommended)
- **CPU**: ~2 GB RAM. Image generation is extremely slow on CPU.

## Quick Start

[Download generate_hf.py](https://huggingface.co/8BitStudio/Aniimage-1/resolve/main/generate_hf.py)

After downloading the script, install the dependencies and run it:

```bash
pip install torch torchvision diffusers transformers safetensors pillow huggingface_hub
python generate_hf.py
```

Recommended settings: the DPM++ 2M scheduler with 25 steps and a CFG scale of 7.5.

Recommended negative prompt: "low quality, ugly, blurry, distorted, deformed, bad anatomy, bad proportions, extra limbs, missing limbs, watermark, text, signature, washed out, flat colors, manga panel, disfigured, poorly drawn, jpeg artifacts, cropped, out of frame"

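
For anyone who prefers calling Diffusers directly instead of running `generate_hf.py`, the recommended settings could be applied roughly as follows. This is a minimal sketch, not the author's script: it assumes the repository loads as a standard pipeline through `DiffusionPipeline.from_pretrained` (suggested by the `library_name: diffusers` metadata but not confirmed by this card) and that the pipeline accepts the usual text-to-image arguments. The helper names `recommended_kwargs` and `generate` are hypothetical.

```python
# Negative prompt recommended in this card.
NEGATIVE_PROMPT = (
    "low quality, ugly, blurry, distorted, deformed, bad anatomy, bad proportions, "
    "extra limbs, missing limbs, watermark, text, signature, washed out, flat colors, "
    "manga panel, disfigured, poorly drawn, jpeg artifacts, cropped, out of frame"
)


def recommended_kwargs(prompt: str) -> dict:
    """Collect the card's recommended sampling settings as pipeline keyword arguments."""
    return {
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "num_inference_steps": 25,  # recommended step count
        "guidance_scale": 7.5,      # recommended CFG scale
        "height": 256,              # native model resolution
        "width": 256,
    }


def generate(prompt: str, device: str = "cuda"):
    """Hypothetical helper: load the model and sample one image."""
    # Imported lazily so the settings above can be inspected without torch installed.
    import torch
    from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

    # ASSUMPTION: the repo is loadable as a standard Diffusers pipeline.
    pipe = DiffusionPipeline.from_pretrained(
        "8BitStudio/Aniimage-1", torch_dtype=torch.float16
    )
    # DPM++ 2M is implemented in Diffusers by DPMSolverMultistepScheduler.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)
    return pipe(**recommended_kwargs(prompt)).images[0]
```

With a CUDA GPU available, `generate("A smiling anime girl with red hair and a school uniform").save("sample.png")` would write one image to disk. If the repository turns out not to follow the standard pipeline layout, `generate_hf.py` remains the supported route.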
## Prompting

Aniimage-1 uses plain-text captions, so for the best results write prompts in plain English rather than as a list of tags.

- Do: "A smiling anime girl with red hair and a school uniform"
- Don't: "1girl, solo, smile, red_hair, school_uniform, anime_coloring"

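
A front end that wraps the model could warn users when a prompt looks like a tag list. The check below is a rough heuristic sketch with a hypothetical function name, `looks_like_tag_list`. It flags count tags such as `1girl` and underscore-joined tags such as `red_hair`, and it will miss tag lists that use neither.

```python
import re

# Count tags such as "1girl" or "2boys".
_COUNT_TAG = re.compile(r"\b\d+(?:girl|boy)s?\b")
# Underscore-joined tags such as "red_hair" or "school_uniform".
_UNDERSCORE_TAG = re.compile(r"\b[a-z0-9]+(?:_[a-z0-9]+)+\b")


def looks_like_tag_list(prompt: str) -> bool:
    """Return True if the prompt contains Danbooru-style tag patterns."""
    text = prompt.lower()
    return bool(_COUNT_TAG.search(text) or _UNDERSCORE_TAG.search(text))


for example in (
    "A smiling anime girl with red hair and a school uniform",
    "1girl, solo, smile, red_hair, school_uniform, anime_coloring",
):
    verdict = "tag-style, consider rewriting" if looks_like_tag_list(example) else "plain English"
    print(f"{verdict}: {example}")
```

The heuristic only detects tag-style input. Rewriting such a prompt into a natural sentence is left to the user.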
## Capabilities

- Anime character generation with varied hair colors and styles
- School uniforms, fantasy outfits, maid dresses, and more
- Background scenes: cherry blossoms, night sky, interiors, nature

## Limitations

- 256×256 resolution: fine details such as hands and small features can be rough
- Faces can sometimes look similar or "melty" across different prompts
- Characters may merge together in complex multi-character scenes
- Little to no NSFW capability: the training dataset is mostly SFW
- Performs worse when generating men due to dataset bias

## What's Next

**Aniimage-1.5**, a 512×512 fine-tune of this model, is currently in development and is expected to significantly improve detail and clarity.

The training code may be released on GitHub at some point.

## License

Apache 2.0