richiejp committed on
Commit 0872e4e · verified · 1 Parent(s): 5d1a0ad

Sync model card with upstream GitHub inference README

Files changed (1):
  1. README.md +11 -11
README.md CHANGED
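The headline figures this commit updates are internally consistent; a quick sanity check (a sketch, assuming 4 bytes per F32 parameter and the 256-sample hop at 16 kHz stated in the card):

```python
BYTES_PER_F32 = 4

# Updated parameter count: 1.3 M parameters stored as F32
params = 1.3e6
weights_mb = params * BYTES_PER_F32 / 1e6
print(f"F32 weights: ~{weights_mb:.1f} MB")  # ~5.2 MB, listed as "~5 MB" in the card

# 256-sample hop at 16 kHz -> frame period in milliseconds
hop_ms = 256 / 16_000 * 1000
# ~1.66 ms of compute per frame -> realtime factor
rtf = hop_ms / 1.66
print(f"hop: {hop_ms:.0f} ms, realtime factor: ~{rtf:.1f}x")  # 16 ms, ~9.6x
```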
@@ -15,7 +15,7 @@ license: apache-2.0
 acoustic echo cancellation (AEC), noise suppression, and dereverberation of
 16 kHz speech, designed to run on commodity CPUs in real time.

-- ~0.9 M parameters (~3.5 MB F32)
+- 1.3 M parameters (~5 MB F32)
 - ~1.66 ms per 16 ms frame on Zen4 (24 threads) — **≈9.6× realtime**
 - Causal, streaming: 256-sample hop, 16 ms algorithmic latency
 - F32 reference inference in C++ via [GGML](https://github.com/ggml-org/ggml);
@@ -90,8 +90,8 @@ that implementation to this one:

 | | DeepVQE (our re-implementation) | LocalVQE |
 |---|---|---|
-| Parameters | ~7.5 M | ~0.9 M |
-| Weights (F32) | ~30 MB | ~3.5 MB |
+| Parameters | ~7.5 M | 1.3 M |
+| Weights (F32) | ~30 MB | ~5 MB |
 | Analysis | STFT (complex FFT) | DCT-II (real, in-graph) |
 | Bottleneck | GRU | S4D (diagonal state space) |
 | CCM arithmetic | Complex | Real-valued (GGML-friendly) |
@@ -105,8 +105,8 @@ parameter count vs GRU at similar quality.

 | File | Size | Description |
 |---|---|---|
-| `localvqe-v1.pt` | 11 MB | PyTorch checkpoint — DNS5 pre-training + ICASSP 2022/2023 AEC Challenge fine-tune. |
-| `localvqe-v1-f32.gguf` | 5 MB | GGML F32 export (BN-folded, DCT weights embedded). This is what the C++ inference engine loads. |
+| `localvqe-v1-1.3M.pt` | 11 MB | PyTorch checkpoint — DNS5 pre-training + ICASSP 2022/2023 AEC Challenge fine-tune. |
+| `localvqe-v1-1.3M-f32.gguf` | 5 MB | GGML F32 export (BN-folded, DCT weights embedded). This is what the C++ inference engine loads. |

 Only F32 GGUF is published today. A `quantize` tool is included in the C++
 build (see below) and the architecture is designed to be Q4_K / Q8_0
@@ -173,7 +173,7 @@ omit them rather than publish misleading figures.
 | Decoder | 5 sub-pixel conv + BN blocks, mirroring encoder |
 | CCM | 27-ch → 3×3 complex convolving mask (real-valued arithmetic) |
 | Kernel | (4, 4) time × freq, causal padding |
-| Parameters | ~0.9 M |
+| Parameters | 1.3 M |

 ## Building the C++ Inference Engine

@@ -237,14 +237,14 @@ for the queue, the "quiet" column is what you'll see.

 ## Running Inference

-Download `localvqe-v1-f32.gguf` from this repository (the file list above)
+Download `localvqe-v1-1.3M-f32.gguf` from this repository (the file list above)
 either via `huggingface-cli`, the Hub web UI, or `hf_hub_download` from
 `huggingface_hub`. Then:

 ### CLI

 ```bash
-./ggml/build/bin/localvqe localvqe-v1-f32.gguf \
+./ggml/build/bin/localvqe localvqe-v1-1.3M-f32.gguf \
   --in-wav mic.wav ref.wav \
   --out-wav enhanced.wav
 ```
@@ -254,7 +254,7 @@ Expects 16 kHz mono PCM for both mic and far-end reference.
 ### Benchmark

 ```bash
-./ggml/build/bin/bench localvqe-v1-f32.gguf \
+./ggml/build/bin/bench localvqe-v1-1.3M-f32.gguf \
   --in-wav mic.wav ref.wav --iters 10 --profile
 ```

@@ -278,7 +278,7 @@ in the C++ build can produce GGUF variants from the F32 reference for
 experimentation:

 ```bash
-./ggml/build/bin/quantize localvqe-v1-f32.gguf localvqe-v1-q8.gguf Q8_0
+./ggml/build/bin/quantize localvqe-v1-1.3M-f32.gguf localvqe-v1-1.3M-q8.gguf Q8_0
 ```

 Expect end-to-end quality loss until proper per-tensor selection and
@@ -286,7 +286,7 @@ calibration have been worked through.

 ## PyTorch Reference

-`localvqe-v1.pt` is the PyTorch checkpoint used to produce the GGUF export.
+`localvqe-v1-1.3M.pt` is the PyTorch checkpoint used to produce the GGUF export.
 It is provided for verification, ablation, and downstream research — not
 for end-user inference, which should go through the GGML build above. The
 model definition lives under `pytorch/` in the