DigitalLearningGmbH/neucodec-decoder-ft-de


	---
	license: apache-2.0
	datasets:
	- amphion/Emilia-Dataset
	language:
	- de
	- en
	base_model:
	- neuphonic/neucodec
	tags:
	- audio
	- speech
	---

	## NeuCodec decoder fine-tuned for German speech

	This repository contains only the decoder of [neuphonic/neucodec](https://huggingface.co/neuphonic/neucodec), fine-tuned on equal amounts of German and English speech data from Emilia-Yodas to improve decoding quality for German speech.
	Because only the decoder was fine-tuned, the codebook is identical to the base model's, so this decoder can be used with the regular NeuCodec encoder.
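
Because the codebook is shared, pairing the two models amounts to passing the base encoder's codes straight to this decoder. The following is a minimal sketch of that pairing, not an official API: `reconstruct` is a hypothetical helper, and the commented usage assumes that the `neucodec` package exposes `NeuCodec.from_pretrained` and `encode_code` on 16 kHz input of shape `(B, 1, T)`, as described in the original model card.

```python
def reconstruct(encoder, decoder, waveform_16k):
    """Encode with one model and decode with another that shares its codebook.

    encoder: any object exposing encode_code(waveform) -> codes
    decoder: any object exposing decode_code(codes) -> audio
    (Hypothetical helper; it only relies on those two method names.)
    """
    codes = encoder.encode_code(waveform_16k)
    audio = decoder.decode_code(codes)
    return codes, audio


# Assumed usage with the real models (requires torch and the `neucodec` package):
#
#   import torch
#   from neucodec import NeuCodec
#   from NeuCodecDecoder import NeuCodecDecoder
#
#   encoder = NeuCodec.from_pretrained("neuphonic/neucodec").eval()
#   decoder = NeuCodecDecoder.from_pretrained(
#       "DigitalLearningGmbH/neucodec-decoder-ft-de"
#   ).eval()
#   with torch.no_grad():
#       codes, audio = reconstruct(encoder, decoder, waveform_16k)
```

Keeping the helper duck-typed means the same call works whether decoding is done by the base model or by the fine-tuned decoder, which makes A/B listening comparisons straightforward.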

	We supply a compact class, `NeuCodecDecoder` (in `NeuCodecDecoder.py`), for running inference with this decoder, since the NeuCodec codebase doesn't easily allow loading model files from other Hugging Face repositories.

	### Inference Example

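	```python
	import torch
	import torchaudio

	from NeuCodecDecoder import NeuCodecDecoder

	decoder_model = NeuCodecDecoder.from_pretrained("DigitalLearningGmbH/neucodec-decoder-ft-de")
	decoder_model = decoder_model.eval().cuda()

	# `tokens` is not defined here: supply a 1-D sequence of NeuCodec code indices,
	# e.g. produced by the regular NeuCodec encoder.
	codes = torch.tensor(tokens).unsqueeze(0).unsqueeze(0).to("cuda")  # shape (1, 1, T)

	with torch.no_grad():
	    decoded = decoder_model.decode_code(codes).cpu()

	# Write the first item of the batch as a 24 kHz WAV file.
	torchaudio.save("decoded.wav", decoded[0, :, :], 24_000)
	```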
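
As a quick sanity check on the decoded output, the expected length can be estimated from the number of tokens. The sketch below uses the 24 kHz output rate from the example above and assumes a rate of 50 tokens per second for NeuCodec; that token rate is not stated in this card, so treat it as an adjustable assumption. Both function names are hypothetical, and the real decoder output may differ by a few samples.

```python
def expected_num_samples(num_tokens, tokens_per_second=50, sample_rate=24_000):
    """Approximate number of output samples for a given token count.

    tokens_per_second=50 is an assumption about NeuCodec, not taken from this card.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens * sample_rate // tokens_per_second


def expected_duration_seconds(num_tokens, tokens_per_second=50):
    """Approximate audio duration in seconds for a given token count."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens / tokens_per_second
```

Comparing `decoded.shape[-1]` against `expected_num_samples(len(tokens))` is a cheap way to catch truncated token sequences or a mismatched sample rate before saving the file.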

	For more information, please refer to [the original model card](https://huggingface.co/neuphonic/neucodec).