FE2E-CPU / app.py

Commit History

Revert to PyTorch INT8 (ONNX export produces NaN)
563ef1b
verified

Nekochu committed on

Reduce ORT memory: disable prepacking, basic optimization, 1 thread
10d0786
verified

Nekochu committed on

Fix: unsqueeze prompt embeddings
1620b49
verified

Nekochu committed on

Switch to ONNX INT8 DiT
6ee4bac
verified

Nekochu committed on

fixes: cudagc guard, rm conditioner.py, turbo depth colormap, proper normal viz, compact UI, example
d65d5b5

Nekochu committed on

fix: load full torch.save model directly (no FP32 construction, mmap)
9d04602

Nekochu committed on

fix OOM: use mmap loading + assign (no memory copy)
b57f6e2

Nekochu committed on

fix: eager model load at startup (lazy load causes SSE timeout)
19a7c6b

Nekochu committed on

fix: move demo to module level (Gradio SDK needs it, not inside main())
284342e

Nekochu committed on

use pre-quantized INT8 model (no FP8 casting, no LoRA merge at runtime)
a3ba5ed

Nekochu committed on

fix: diffusers dep, layer-by-layer FP8->FP32 cast, LoRA merge in FP32, INT8 quant
551acb3

Nekochu committed on

FE2E depth+normal CPU Space: FP8 dynamic INT8, single denoise
405d2b1

Nekochu committed on