Update README.md
README.md
CHANGED
|
@@ -9,13 +9,19 @@ pipeline_tag: text-to-speech
|
|
| 9 |
|
| 10 |
`wfloat-tts` is a lightweight multi-speaker English VITS text-to-speech model with speaker, emotion, and intensity control.
|
| 11 |
|
| 12 |
-
|
| 13 |
|
| 14 |
-
|
| 15 |
-
- `config.json`: model config and token mapping
|
| 16 |
-
- `src/wfloat_tts/`: a small Python inference helper
|
| 17 |
|
| 18 |
-
|
| 19 |
|
| 20 |
## Sample Outputs
|
| 21 |
|
|
@@ -70,6 +76,10 @@ You do not need to pass raw control symbols. The Python helper converts `emotion
|
|
| 70 |
|
| 71 |
## Install
|
| 72 |
|
| 73 |
```bash
|
| 74 |
pip install -e .
|
| 75 |
pip install "piper-phonemize==1.3.0" -f https://k2-fsa.github.io/icefall/piper_phonemize
|
|
@@ -163,3 +173,5 @@ Supported emotion labels:
|
|
| 163 |
- `model.safetensors` is the main inference artifact in this repo.
|
| 164 |
- `config.json` includes the token mapping needed by the processor.
|
| 165 |
- The current release uses a multi-speaker model with 20 speakers.
|
| 9 |
|
| 10 |
`wfloat-tts` is a lightweight multi-speaker English VITS text-to-speech model with speaker, emotion, and intensity control.
|
| 11 |
|
| 12 |
+
## On-Device Packages
|
| 13 |
|
| 14 |
+
This Hugging Face repo contains the model files.
|
| 15 |
|
| 16 |
+
Wfloat also ships packages that distribute and run `wfloat-tts` locally on the user's device.
|
| 17 |
+
|
| 18 |
+
Available packages:
|
| 19 |
+
|
| 20 |
+
- [Web](https://github.com/wfloat/wfloat-web) for running in the browser, including mobile browsers
|
| 21 |
+
- [React Native](https://github.com/wfloat/react-native-wfloat) for running locally in iOS and Android apps
|
| 22 |
+
- [Python](https://github.com/wfloat/wfloat-python) for running in Python environments
|
| 23 |
+
|
| 24 |
+
Missing the platform or framework you need? [Please request it!](https://docs.google.com/forms/d/e/1FAIpQLScLjcb4lkouSQ54ZWDKJ1xlCkUpBFamF1zKRO3fno1vp1Y_IQ/viewform?usp=preview)
|
| 25 |
|
| 26 |
## Sample Outputs
|
| 27 |
|
|
|
|
| 76 |
|
| 77 |
## Install
|
| 78 |
|
| 79 |
+
Use the steps below to run the model from this Hugging Face repo.
|
| 80 |
+
|
| 81 |
+
Official Python package: [wfloat-python](https://github.com/wfloat/wfloat-python).
|
| 82 |
+
|
| 83 |
```bash
|
| 84 |
pip install -e .
|
| 85 |
pip install "piper-phonemize==1.3.0" -f https://k2-fsa.github.io/icefall/piper_phonemize
|
|
|
|
| 173 |
- `model.safetensors` is the main inference artifact in this repo.
|
| 174 |
- `config.json` includes the token mapping needed by the processor.
|
| 175 |
- The current release uses a multi-speaker model with 20 speakers.
|
| 176 |
+
- Training code: [https://github.com/wfloat/piper](https://github.com/wfloat/piper)
|
| 177 |
+
- For the checkpoint needed to resume training, message `mitch@wfloat.com`.
|
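The notes above name `model.safetensors` as the main inference artifact. As a quick sanity check after downloading it, the sketch below lists the tensor names, dtypes, and shapes without loading any weights. It assumes only the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). `read_safetensors_header` and `summarize_tensors` are hypothetical helper names, not part of the repo's `src/wfloat_tts/` helper, and the local file path is an assumption.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading weights."""
    with Path(path).open("rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON mapping tensor names to metadata.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_tensors(header):
    """Map tensor name -> (dtype, shape), skipping the optional __metadata__ entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

With the file in the working directory, `summarize_tensors(read_safetensors_header("model.safetensors"))` returns a dictionary that can be printed or compared against the values in `config.json`.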