---
license: apache-2.0
datasets:
- amphion/Emilia-Dataset
language:
- de
- en
base_model:
- neuphonic/neucodec
tags:
- audio
- speech
---

## NeuCodec decoder fine-tuned for German speech

This repository contains only the decoder of [neuphonic/neucodec](https://huggingface.co/neuphonic/neucodec), fine-tuned on equal amounts of German and English speech data from Emilia-Yodas to improve decoding quality for German speech.
Because only the decoder was fine-tuned, the codebook is identical to that of the base model, so this decoder can be used with the regular NeuCodec encoder.

We supply a compact class, `NeuCodecDecoder` (in `NeuCodecDecoder.py`), for running inference with this decoder, since the NeuCodec codebase doesn't easily allow loading model files from other HuggingFace repos.

### Inference Example

```python
import torch
import torchaudio

from NeuCodecDecoder import NeuCodecDecoder

decoder_model = NeuCodecDecoder.from_pretrained("DigitalLearningGmbH/neucodec-decoder-ft-de")
decoder_model = decoder_model.eval().cuda()

# `tokens` is not defined here: it is assumed to be a flat sequence of NeuCodec
# codes, e.g. produced by the regular NeuCodec encoder.
# The two unsqueeze calls add batch and channel dimensions, giving shape (1, 1, T).
with torch.no_grad():
    decoded = decoder_model.decode_code(torch.tensor(tokens).unsqueeze(0).unsqueeze(0).to("cuda")).cpu()

# Write the first item of the batch as a 24 kHz WAV file.
torchaudio.save("decoded.wav", decoded[0, :, :], 24_000)
```
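The example above hard-codes CUDA and builds the `(1, 1, T)` code tensor inline. As a minimal sketch, the same steps can be wrapped in two small helpers that run on whatever device the model is on. The names `to_batched_codes` and `decode_tokens` are hypothetical and not part of this repo. The sketch assumes that `NeuCodecDecoder` is a regular `torch.nn.Module` (so its device can be read from its parameters) and that codes are non-negative integers.

```python
from typing import Sequence


def to_batched_codes(tokens: Sequence[int]) -> list:
    """Wrap a flat token sequence as a nested list of shape (1, 1, T).

    Hypothetical helper: mirrors the two unsqueeze(0) calls in the
    inference example, plus basic sanity checks.
    """
    flat = [int(t) for t in tokens]
    if not flat:
        raise ValueError("tokens must contain at least one code")
    if any(t < 0 for t in flat):
        # Assumption: codebook indices are never negative.
        raise ValueError("codes must be non-negative integers")
    return [[flat]]


def decode_tokens(decoder_model, tokens: Sequence[int]):
    """Decode a flat token sequence on the device the model lives on.

    Assumption: `decoder_model` is a torch.nn.Module exposing `decode_code`,
    as used in the inference example.
    """
    import torch  # imported lazily so to_batched_codes stays dependency-free

    device = next(decoder_model.parameters()).device
    codes = torch.tensor(to_batched_codes(tokens), dtype=torch.long, device=device)
    with torch.no_grad():
        return decoder_model.decode_code(codes).cpu()
```

With these helpers, `decoded = decode_tokens(decoder_model, tokens)` would replace the `with torch.no_grad():` block, and the model could be left on the CPU by dropping the `.cuda()` call.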

For more information, please refer to [the original model card](https://huggingface.co/neuphonic/neucodec).