Spaces:
Running
Running
metadata
title: VibeThinker WebGPU Train + Hot-Swap LoRA
emoji: 🔥
colorFrom: red
colorTo: yellow
sdk: static
app_file: index.html
pinned: false
license: apache-2.0
models:
- WeiboAI/VibeThinker-3B
short_description: In-browser WebGPU LoRA training + hot-swap (VibeThinker-3B)
VibeThinker-3B · in-browser WebGPU training + hot-swap LoRA
A static Space that runs VibeThinker-3B entirely in your browser tab on a custom WebGPU int4 runtime — no server, no upload, nothing leaves the page.
What it shows off:
- WebGPU kernels — int4 GEMV/GEMM, fused RMSNorm+QKV+RoPE, paged attention, f32 accumulation in f16 paths. The whole forward + backward stack is WGSL.
- In-browser training — fine-tune a LoRA adapter in the tab with a real backward pass + AdamW over the frozen int4 base (gradient checkpointing, f32 master weights). The guided demo teaches made-up facts in ~30s.
- Hot-swap LoRA — every trained adapter is saved locally (IndexedDB) and
hot-swaps into inference instantly; export to
.safetensorsor re-load later.
The model weights are streamed once from
WeiboAI/VibeThinker-3B on the
Hub and cached in your browser. A Space does not have to be tied to a model;
the model above is linked for discovery only.
Requires a WebGPU-capable browser (recent Chrome/Edge; Safari Technology Preview). ~6 GB weights, one-time download.