Add files using upload-large-folder tool


Files changed (7)

.gitattributes +5 -0
README.md +82 -0
tokenizer.gguf +3 -0
vibevoice-asr-q8_0.gguf +3 -0
vibevoice-realtime-0.5B-q8_0.gguf +3 -0
voice-en-Carter_man.gguf +3 -0
voice-en-Emma.gguf +3 -0

.gitattributes CHANGED

@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text

 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.gguf filter=lfs diff=lfs merge=lfs -text
+voice-en-Emma.gguf filter=lfs diff=lfs merge=lfs -text
+voice-en-Carter_man.gguf filter=lfs diff=lfs merge=lfs -text
+vibevoice-realtime-0.5B-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+vibevoice-asr-q8_0.gguf filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,82 @@

+---
+license: mit
+library_name: vibevoice.cpp
+tags:
+  - tts
+  - asr
+  - speech
+  - vibevoice
+  - gguf
+  - ggml
+base_model:
+  - microsoft/VibeVoice-Realtime-0.5B
+  - microsoft/VibeVoice-ASR
+---
+# vibevoice.cpp — quantized model bundle
+Quantized GGUF weights for [vibevoice.cpp](https://github.com/mudler/vibevoice.cpp),
+a C/C++ port of Microsoft VibeVoice (TTS + ASR) on top of `ggml`.
+| File | Source | Quant | Size |
+| ---- | ------ | ----- | ---- |
+| `vibevoice-realtime-0.5B-q8_0.gguf` | `microsoft/VibeVoice-Realtime-0.5B` | Q8_0 (matmul) + F16 | ~1.6 GB |
+| `vibevoice-asr-q8_0.gguf`           | `microsoft/VibeVoice-ASR`           | Q8_0 (matmul) + F16 | ~13 GB |
+| `voice-en-Carter_man.gguf`          | upstream voice prompt cache         | F16                  | 8 MB |
+| `voice-en-Emma.gguf`                | upstream voice prompt cache         | F16                  | 6 MB |
+| `tokenizer.gguf`                    | Qwen2.5 BPE + VibeVoice specials    | —                    | 6 MB |
+## Quantization scheme
+`scripts/quantize_gguf.py` in the source repo selectively quantizes only the
+LM matmul weights — attention q/k/v/o, ffn gate/up/down, and lm_head — to
+Q8_0. Everything else (1-D conv kernels, RMSNorm scales, biases,
+layer-scale gammas, token embeddings, small scalars) passes through
+unchanged. The conv1d implementation in vibevoice.cpp casts kernels to F16
+inline rather than dequantizing them on the fly, so quantized conv kernels
+would corrupt the convolution outputs.
+Q8_0 was chosen because it can be implemented in pure Python with `gguf-py`
+and gives a ~60% size reduction on the 7B ASR model with no measurable
+quality regression in the closed-loop TTS → ASR roundtrip test.
+## Quickstart
+```bash
+git clone --recursive https://github.com/mudler/vibevoice.cpp
+cd vibevoice.cpp && cmake -B build -DVIBEVOICE_BUILD_TESTS=ON && cmake --build build -j
+# Pull this bundle
+mkdir -p models && cd models
+hf download mudler/vibevoice.cpp-models --local-dir .
+cd ..
+# TTS
+build/bin/vibevoice-cli tts \
+    --model models/vibevoice-realtime-0.5B-q8_0.gguf \
+    --voice models/voice-en-Carter_man.gguf \
+    --tokenizer models/tokenizer.gguf \
+    --text "Hello world this is a test of the synthesis system." \
+    --out hello.wav
+# ASR
+build/bin/vibevoice-cli asr \
+    --model models/vibevoice-asr-q8_0.gguf \
+    --tokenizer models/tokenizer.gguf \
+    --audio hello.wav
+# -> [{"Start":0,"End":2.8,"Speaker":0,"Content":"Hello world, this is a test of the synthesis system."}]
+```
+## Closed-loop verification
+The `test_closed_loop` ctest in vibevoice.cpp runs TTS → ASR end-to-end
+and asserts ≥80% source-word recall in the recovered transcript. With
+this bundle (both Q8_0 models) it passes at 10/10 (100%).
+## License
+Weights are derived from Microsoft VibeVoice
+([VibeVoice-Realtime-0.5B](https://huggingface.co/microsoft/VibeVoice-Realtime-0.5B)
+and [VibeVoice-ASR](https://huggingface.co/microsoft/VibeVoice-ASR));
+follow the upstream model licenses for use. The conversion + quantization
+tooling is released under MIT as part of vibevoice.cpp.
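
As a reading aid for the README's "Quantization scheme" section, here is a minimal pure-Python sketch of the Q8_0 block layout as ggml defines it: 32 weights per block, stored as one F16 scale plus 32 signed 8-bit values. This is not the code in `scripts/quantize_gguf.py`; the function names (`quantize_q8_0`, `dequantize_q8_0`, `to_f16`, `round_half_away`) are illustrative assumptions, and blocks are kept as Python tuples rather than packed bytes.

```python
import math
import struct

QK8_0 = 32                    # weights per Q8_0 block in ggml
BYTES_PER_BLOCK = 2 + QK8_0   # one F16 scale + 32 int8 values = 34 bytes


def to_f16(x):
    """Round a float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_half_away(x):
    """Round to the nearest integer, ties away from zero (like C roundf)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize_q8_0(weights):
    """Quantize a flat list of floats into (scale, int8 list) blocks."""
    if len(weights) % QK8_0:
        raise ValueError("length must be a multiple of 32")
    blocks = []
    for start in range(0, len(weights), QK8_0):
        chunk = weights[start:start + QK8_0]
        amax = max(abs(w) for w in chunk)
        d = amax / 127.0                   # per-block scale
        inv_d = 1.0 / d if d else 0.0      # an all-zero block keeps scale 0
        quants = [round_half_away(w * inv_d) for w in chunk]
        blocks.append((to_f16(d), quants))  # the scale is stored as F16
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate floats: weight ~= scale * quant."""
    return [d * q for d, quants in blocks for q in quants]


if __name__ == "__main__":
    weights = [math.sin(i) for i in range(64)]
    restored = dequantize_q8_0(quantize_q8_0(weights))
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"max abs error: {worst:.5f}")
    print(f"bits per weight: {BYTES_PER_BLOCK * 8 / QK8_0}")
```

Each block costs 34 bytes for 32 weights, i.e. 8.5 bits per weight. The overall file-size reduction also depends on how many tensors are left unquantized and on the precision of the source checkpoint.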

tokenizer.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:37dc3b722d5677e37e29a57df55aa05c485116eeb5459e57ff8dde616b4986f6
+size 5922368

vibevoice-asr-q8_0.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:39ddded77a094a1fad9031fbaaee04943d7906d314d51161976bf393cca343d6
+size 13927206208

vibevoice-realtime-0.5B-q8_0.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5251e3f0386d1056a90c61b6c7359a4775da44dd19402499bef1989c4b5c653a
+size 1699832128

voice-en-Carter_man.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b15cd8b9cae6ee2c3d20b0ee6e7bfe93f13489f8b63b6834e9bbf0dfabf6505a
+size 8472448

voice-en-Emma.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c96a15786835d73d0e3e7e37af668de6f93392e04de0ada33512ff83f6cc4ba
+size 6647168
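
Each of the five pointer files above records a SHA-256 `oid` and a byte `size` for the real file held in LFS storage. The following is a minimal sketch, assuming a file has already been downloaded locally, of checking it against its pointer. The helper names `parse_lfs_pointer` and `verify_file` are illustrative and not part of any tool named in this commit; the example pointer text is copied from the `tokenizer.gguf` entry.

```python
import hashlib
import os

# Pointer contents for tokenizer.gguf, copied from the diff above.
TOKENIZER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:37dc3b722d5677e37e29a57df55aa05c485116eeb5459e57ff8dde616b4986f6
size 5922368
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.strip().partition(" ")
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated file
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(TOKENIZER_POINTER)
    path = "models/tokenizer.gguf"  # assumed download location
    if os.path.exists(path):
        print("tokenizer.gguf ok:", verify_file(path, pointer))
```

Comparing the size before hashing makes an interrupted download of the ~13 GB ASR model fail fast instead of after a full read.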