VITS2 - Claude (Luxembourgish Gender-Neutral Voice)

A VITS2-based text-to-speech model for Luxembourgish, featuring a synthetic gender-neutral voice.

Model Description

This model was trained using the VITS2 architecture on Luxembourgish speech data from the Lëtzebuerger Online Dictionnaire (LOD) example sentences.

"Claude" is a synthetic gender-neutral Luxembourgish voice created by modulating the original LOD recordings.

Model Details

  • Architecture: VITS2 with duration discriminator and transformer flows
  • Language: Luxembourgish (lb)
  • Speaker: Single speaker (gender-neutral, synthetic)
  • Sample Rate: 24000 Hz
  • Checkpoint: G_57000 (57,000 steps)
  • License: MIT

Usage

Note: Text should be lowercased before synthesis. Additional text normalization may be required.
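The lowercasing step in the note above can be sketched as a small pre-processing helper. The name `normalize_text` is hypothetical and not part of the included source files. The Unicode composition and whitespace cleanup are assumptions about what "additional normalization" might include, not the model's documented pipeline.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Lowercase and lightly clean text before synthesis (illustrative helper)."""
    # Compose accented characters such as "é" and "ë" into single code points.
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(normalize_text("Moien,  wéi geet et Dir?"))
```

Numbers, abbreviations, and symbols are not handled here and would likely need to be spelled out separately.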

This model requires the included Python source files for inference.

Basic Usage

import torch
import scipy.io.wavfile as wavfile
from vits2_engine import VITS2Engine

# Load the model
engine = VITS2Engine(model_dir="path/to/vits2-claude")

# Generate speech
wav = engine.tts("moien, wéi geet et dir?")

# Save to file
wavfile.write("output.wav", engine.sample_rate, wav)

Command Line

python inference.py "moien, wéi geet et dir?"

# With custom parameters
python inference.py "moien, wéi geet et dir?" --noise_scale 0.5 --length_scale 1.1 -o output.wav

Parameters

  • noise_scale: Controls voice variation (default: 0.667, lower = more consistent)
  • noise_scale_w: Controls duration variation (default: 0.8)
  • length_scale: Controls speech speed (default: 1.0, higher = slower)
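To illustrate why a higher length_scale gives slower speech: in VITS-style inference code, the predicted per-token durations (in spectrogram frames) are typically multiplied by length_scale and rounded up. The sketch below reproduces only that arithmetic in plain Python. The function name and the duration values are made up for illustration, and this model's own code may differ in detail.

```python
import math


def scale_durations(durations, length_scale=1.0):
    """Scale per-token frame durations and round each one up (illustrative)."""
    return [math.ceil(d * length_scale) for d in durations]


predicted = [2.0, 3.5, 1.2]  # hypothetical per-token durations in frames

normal = scale_durations(predicted, length_scale=1.0)
slower = scale_durations(predicted, length_scale=1.5)

print(sum(normal), sum(slower))  # total frames grows with length_scale
```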

Technical Specifications

Parameter          Value
Hidden Channels    192
Filter Channels    768
Attention Heads    2
Encoder Layers     6
Mel Channels       80
FFT Size           1024
Hop Length         256
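As a rough guide to what these values mean in time, the hop length and FFT size can be converted to milliseconds. This sketch assumes the spectrogram is computed at the 24000 Hz sample rate listed under Model Details.

```python
SAMPLE_RATE = 24000  # Hz, from Model Details
HOP_LENGTH = 256     # samples between consecutive spectrogram frames
FFT_SIZE = 1024      # samples per analysis window

frame_shift_ms = HOP_LENGTH / SAMPLE_RATE * 1000  # time step per frame
window_ms = FFT_SIZE / SAMPLE_RATE * 1000         # span of one FFT window
frames_per_second = SAMPLE_RATE / HOP_LENGTH

print(f"frame shift: {frame_shift_ms:.2f} ms")
print(f"FFT window:  {window_ms:.2f} ms")
print(f"frames/sec:  {frames_per_second:.2f}")
```

Under that assumption, each frame advances by about 10.67 ms, and one second of audio corresponds to 93.75 frames.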

Requirements

  • Python 3.8+
  • PyTorch
  • scipy
  • numpy
  • Cython (for monotonic_align)

Citation

If you use this model, please cite:

@misc{zls2025vits2claude,
  title={VITS2 Claude - Luxembourgish Gender-Neutral Voice},
  author={Zenter fir d'Lëtzebuerger Sprooch},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/ZLSCompLing/VITS2-Claude}
}

Acknowledgments

Developed by Zenter fir d'Lëtzebuerger Sprooch.

Voice data sourced from the Lëtzebuerger Online Dictionnaire (LOD). The original audio files are available via the LOD linguistic data on data.public.lu, which provides an XML file containing example sentence IDs. Audio files can be accessed at:

https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a

where {folder} is the first 2 characters of {id}.
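The URL pattern above can be expressed as a short helper. The function name `lod_audio_url` and the example ID are hypothetical; real IDs come from the LOD XML file.

```python
LOD_AUDIO_BASE = "https://lod.lu/uploads/examples/AAC"


def lod_audio_url(example_id: str) -> str:
    """Build the audio URL for a LOD example sentence ID."""
    if len(example_id) < 2:
        raise ValueError("example ID must have at least 2 characters")
    folder = example_id[:2]  # {folder} is the first 2 characters of {id}
    return f"{LOD_AUDIO_BASE}/{folder}/{example_id}.m4a"


# "AB12345" is a made-up ID used only to show the resulting URL shape.
print(lod_audio_url("AB12345"))
```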

This model is used in Sproochmaschinn, a Luxembourgish speech processing platform.
