feat: spectator mode - all visitors see the same streaming generation b1e1a05 H-Liu1997 committed on 2 days ago
fix: build proper attention mask in SDPA fallback for text cross-attention 6c6483b H-Liu1997 committed on 3 days ago
perf: increase server frame buffer target to 16 for batch fetching 7237651 H-Liu1997 committed on 3 days ago
revert: remove float16 override, bfloat16 is safer (float16 overflow risk) 2cefd84 H-Liu1997 committed on 3 days ago
perf: use float16 instead of bfloat16 in SDPA for T4 tensor core acceleration 80c3e53 H-Liu1997 committed on 3 days ago
fix: adapt model_manager to HF model API (no schedule_config/cfg_config dicts) 47551f4 H-Liu1997 committed on 3 days ago
fix: cast SDPA output back to original dtype + idempotent patching e843211 H-Liu1997 committed on 3 days ago
fix: patch flash_attention with SDPA fallback for T4 (no flash-attn) bb7e158 H-Liu1997 committed on 3 days ago
feat: initial FloodDiffusion streaming demo for HF Space a4f8eb3 H-Liu1997 Claude Opus 4.6 (1M context) committed on 3 days ago