FloodDiffusion-Streaming / model_manager.py

Commit History

feat: spectator mode - all visitors see the same streaming generation
b1e1a05

H-Liu1997 committed on

fix: move k_lens to GPU in SDPA fallback (tested locally)
7d73321

H-Liu1997 committed on

fix: build proper attention mask in SDPA fallback for text cross-attention
6c6483b

H-Liu1997 committed on

perf: increase server frame buffer target to 16 for batch fetching
7237651

H-Liu1997 committed on

revert: remove float16 override, bfloat16 is safer (float16 overflow risk)
2cefd84

H-Liu1997 committed on

perf: use float16 instead of bfloat16 in SDPA for T4 tensor core acceleration
80c3e53

H-Liu1997 committed on

fix: adapt model_manager to HF model API (no schedule_config/cfg_config dicts)
47551f4

H-Liu1997 committed on

fix: cast SDPA output back to original dtype + idempotent patching
e843211

H-Liu1997 committed on

fix: patch flash_attention with SDPA fallback for T4 (no flash-attn)
bb7e158

H-Liu1997 committed on

feat: initial FloodDiffusion streaming demo for HF Space
a4f8eb3

H-Liu1997, Claude Opus 4.6 (1M context) committed on