Reduce ORT memory: disable prepacking, basic optimization, 1 thread 10d0786 verified Nekochu committed on 10 days ago
fixes: cudagc guard, rm conditioner.py, turbo depth colormap, proper normal viz, compact UI, example d65d5b5 Nekochu committed on 11 days ago
fix: load full torch.save model directly (no FP32 construction, mmap) 9d04602 Nekochu committed on 12 days ago
fix: eager model load at startup (lazy load causes SSE timeout) 19a7c6b Nekochu committed on 12 days ago
fix: move demo to module level (Gradio SDK needs it, not inside main()) 284342e Nekochu committed on 12 days ago
use pre-quantized INT8 model (no FP8 casting, no LoRA merge at runtime) a3ba5ed Nekochu committed on 12 days ago
fix: diffusers dep, layer-by-layer FP8->FP32 cast, LoRA merge in FP32, INT8 quant 551acb3 Nekochu committed on 12 days ago
FE2E depth+normal CPU Space: FP8 dynamic INT8, single denoise 405d2b1 Nekochu committed on 12 days ago