VITS2 - Claude (Luxembourgish Gender-Neutral Voice)

A VITS2-based text-to-speech model for Luxembourgish, featuring a synthetic gender-neutral voice.

Model Description

This model was trained with the VITS2 architecture on Luxembourgish recordings of the example sentences from the Lëtzebuerger Online Dictionnaire (LOD).

"Claude" is a synthetic gender-neutral Luxembourgish voice created by modulating the original LOD recordings.

Model Details

Architecture: VITS2 with duration discriminator and transformer flows
Language: Luxembourgish (lb)
Speaker: Single speaker (gender-neutral, synthetic)
Sample Rate: 24000 Hz
Checkpoint: G_57000 (57,000 steps)
License: MIT

Usage

Note: Text should be lowercased before synthesis. Additional text normalization may be required.
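As a minimal sketch of the lowercasing step in the note above, the helper below prepares a string before it is passed to the model. The function name normalize_text is hypothetical and not part of the included source files. The Unicode NFC step and the whitespace collapsing are assumptions about what "additional normalization" might involve; only the lowercasing is stated by this model card.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Hypothetical pre-processing helper, not part of the model's source files."""
    # Assumption: compose accented characters (e.g. "e" + combining acute -> "é")
    # so the same letter always maps to the same symbol.
    text = unicodedata.normalize("NFC", text)
    # Stated in the model card: input text should be lowercased.
    text = text.lower()
    # Assumption: collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    return text


print(normalize_text("  Moien,   Wéi geet et DIR? "))
```

Numbers, abbreviations, and symbols are not handled here and would likely need to be written out by hand or by a separate step.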

This model requires the included Python source files for inference.

Basic Usage

import torch
import scipy.io.wavfile as wavfile
from vits2_engine import VITS2Engine

# Load the model
engine = VITS2Engine(model_dir="path/to/vits2-claude")

# Generate speech
wav = engine.tts("moien, wéi geet et dir?")

# Save to file
wavfile.write("output.wav", engine.sample_rate, wav)

Command Line

python inference.py "moien, wéi geet et dir?"

# With custom parameters
python inference.py "Text" --noise_scale 0.5 --length_scale 1.1 -o output.wav

Parameters

noise_scale: Controls voice variation (default: 0.667, lower = more consistent)
noise_scale_w: Controls duration variation (default: 0.8)
length_scale: Controls speech speed (default: 1.0, higher = slower)
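To illustrate why a higher length_scale gives slower speech, here is a small sketch. In the original VITS reference inference code, the predicted per-token durations are multiplied by length_scale and rounded up to whole spectrogram frames. Whether the source files shipped with this model do exactly the same is an assumption. The function name scaled_frame_counts and the duration values are made up for illustration.

```python
import math


def scaled_frame_counts(base_durations, length_scale=1.0):
    """Scale per-token durations (in frames) and round each up to a whole frame."""
    return [math.ceil(d * length_scale) for d in base_durations]


# Hypothetical predicted durations for five input tokens, in spectrogram frames.
base = [3.2, 5.0, 2.4, 7.9, 4.1]

for scale in (0.9, 1.0, 1.1):
    frames = scaled_frame_counts(base, scale)
    # More frames in total means a longer, slower utterance.
    print(scale, frames, sum(frames))
```

Because every token is rounded up separately, the total length does not grow in exact proportion to length_scale, especially for short inputs.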

Technical Specifications

Parameter	Value
Hidden Channels	192
Filter Channels	768
Attention Heads	2
Encoder Layers	6
Mel Channels	80
FFT Size	1024
Hop Length	256
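The FFT size and hop length above, combined with the 24000 Hz sample rate from Model Details, fix the time and frequency resolution of the spectrograms. The arithmetic can be sketched as follows. The variable names are illustrative, and the window duration assumes the analysis window is as long as the FFT size, which this model card does not state.

```python
SAMPLE_RATE = 24000  # Hz, from Model Details
N_FFT = 1024         # FFT size
HOP_LENGTH = 256     # samples between consecutive frames

# Time between consecutive spectrogram frames, in milliseconds.
frame_shift_ms = 1000 * HOP_LENGTH / SAMPLE_RATE
# Duration covered by one FFT, in milliseconds (assumes window length == N_FFT).
window_ms = 1000 * N_FFT / SAMPLE_RATE
# Number of spectrogram frames per second of audio.
frames_per_second = SAMPLE_RATE / HOP_LENGTH
# Number of linear frequency bins in a one-sided spectrum.
linear_bins = N_FFT // 2 + 1
# Spacing between adjacent linear frequency bins, in Hz.
bin_width_hz = SAMPLE_RATE / N_FFT

print(f"frame shift: {frame_shift_ms:.2f} ms")
print(f"window:      {window_ms:.2f} ms")
print(f"frames/s:    {frames_per_second}")
print(f"linear bins: {linear_bins}")
print(f"bin width:   {bin_width_hz} Hz")
```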

Requirements

Python 3.8+
PyTorch
scipy
numpy
Cython (for monotonic_align)

Citation

If you use this model, please cite:

@misc{zls2025vits2claude,
  title={VITS2 Claude - Luxembourgish Gender-Neutral Voice},
  author={Zenter fir d'Lëtzebuerger Sprooch},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/ZLSCompLing/VITS2-Claude}
}

Acknowledgments

Developed by Zenter fir d'Lëtzebuerger Sprooch.

Voice data sourced from the Lëtzebuerger Online Dictionnaire (LOD). The LOD linguistic data on data.public.lu includes an XML file containing the example sentence IDs, and the original audio file for each ID can be accessed at:

https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a

where {folder} is the first two characters of {id}.
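The URL pattern above can be filled in with a few lines of Python. This is a sketch: the function name lod_audio_url and the example ID are made up for illustration, and real IDs have to be taken from the LOD XML file.

```python
URL_TEMPLATE = "https://lod.lu/uploads/examples/AAC/{folder}/{id}.m4a"


def lod_audio_url(example_id: str) -> str:
    """Build the audio URL for one example sentence ID (hypothetical helper)."""
    if len(example_id) < 2:
        raise ValueError("example ID must have at least two characters")
    # The folder is the first two characters of the ID.
    return URL_TEMPLATE.format(folder=example_id[:2], id=example_id)


# "ab12345" is a placeholder, not a real LOD example ID.
print(lod_audio_url("ab12345"))
```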

This model is used in Sproochmaschinn, a Luxembourgish speech processing platform.
