How to use ResembleAI/Chatterbox-Multilingual-hi with Chatterbox:

```python
# pip install chatterbox-tts
import torchaudio as ta
from chatterbox.tts import ChatterboxTTS

model = ChatterboxTTS.from_pretrained(device="cuda")

text = "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."
wav = model.generate(text)
ta.save("test-1.wav", wav, model.sr)

# If you want to synthesize with a different voice, specify the audio prompt
AUDIO_PROMPT_PATH = "YOUR_FILE.wav"
wav = model.generate(text, audio_prompt_path=AUDIO_PROMPT_PATH)
ta.save("test-2.wav", wav, model.sr)
```
Live demo: Try this model in the ResembleAI/Chatterbox-Multilingual-TTS-hi Space.
Chatterbox Multilingual: Hindi
Chatterbox Multilingual: Hindi is a dedicated single-language finetune in the Chatterbox Multilingual V3 Single Language Pack. It is optimized for Hindi, with language- and region-specific behavior for expressive text-to-speech and voice cloning.
Use this model when you want tighter Hindi quality control than the broad multilingual checkpoint. For a single model that covers all supported languages, use ResembleAI/chatterbox.
Demo
Try the hosted demo Space: ResembleAI/Chatterbox-Multilingual-TTS-hi.
Files
- t3_hi.safetensors: T3 state dict in safetensors format.
- s3gen_v3.pt / s3gen_v3.safetensors: V3 S3Gen speech decoder checkpoint.
- grapheme_mtl_merged_expanded_v1.json: multilingual tokenizer config.
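A minimal sketch of fetching the files listed above with the `huggingface_hub` client. The repository ID and file names are taken from this card; the helper name `download_assets` and the choice of the `.safetensors` variant of the S3Gen checkpoint are assumptions made for illustration, not part of the official Chatterbox loader.

```python
from pathlib import Path

REPO_ID = "ResembleAI/Chatterbox-Multilingual-hi"

# File names as listed in this card; adjust if the repository layout differs.
ASSET_FILES = [
    "t3_hi.safetensors",
    "s3gen_v3.safetensors",
    "grapheme_mtl_merged_expanded_v1.json",
]


def download_assets(repo_id=REPO_ID, filenames=ASSET_FILES):
    """Download each listed file and return a {file name: local Path} mapping.

    Hypothetical helper: it only fetches the files into the local Hugging Face
    cache and does not load them into a model.
    """
    # Imported lazily so the constants above can be used without the dependency.
    from huggingface_hub import hf_hub_download

    return {
        name: Path(hf_hub_download(repo_id=repo_id, filename=name))
        for name in filenames
    }
```

Calling `download_assets()` requires network access and `pip install huggingface_hub`; the returned paths point into the local cache.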
Language
- Locale: hi
- Chatterbox language ID: hi
Checkpoint Metadata
- Source step: 131000
- Source checkpoint: t3_131000.pth.tar
- Tensor count: 292
- Dtype: float32
- Text embedding shape: (2454, 1024)
- Speech embedding shape: (8194, 1024)
- Size: 2143990224 bytes
- SHA256: 89fd813802e2cf7350d609959cec4dae63dd58f445651b05009262d7e24780f9
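The size and SHA256 values above can be used to check a downloaded checkpoint for corruption. The sketch below uses only the standard library. It assumes the metadata describes `t3_hi.safetensors`, which this card does not state explicitly, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# Values copied from the checkpoint metadata in this card.
EXPECTED_SIZE = 2143990224
EXPECTED_SHA256 = "89fd813802e2cf7350d609959cec4dae63dd58f445651b05009262d7e24780f9"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a multi-gigabyte checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, expected_size=EXPECTED_SIZE, expected_sha256=EXPECTED_SHA256):
    """Return True only if both the byte size and the SHA256 digest match."""
    path = Path(path)
    # Cheap size check first; skip hashing when the size is already wrong.
    if path.stat().st_size != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256.lower()
```

For example, `matches_metadata("t3_hi.safetensors")` would be expected to return `True` for an intact download, provided the metadata does refer to that file.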
Loader Notes
This repository contains Chatterbox Multilingual V3 single-language assets used by the linked demo Space. The T3 checkpoint is loaded with multilingual vocabulary shape 2454 and S3 speech vocabulary shape 8194.
The demo combines these model-specific assets with the shared Chatterbox inference code and companion assets needed for end-to-end speech generation.
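The vocabulary shapes mentioned above can be checked without loading any weights, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing every tensor. The sketch below reads only that header. The tensor names inside the checkpoint are not given in this card, so the code filters by leading dimension rather than by name; `summarize` is a hypothetical helper.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the per-tensor JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(path, vocab_sizes=(2454, 8194)):
    """Report tensor count, dtypes, and tensors whose leading dimension is a vocabulary size."""
    header = read_safetensors_header(path)
    vocab_tensors = {
        name: tuple(entry["shape"])
        for name, entry in header.items()
        if entry["shape"] and entry["shape"][0] in vocab_sizes
    }
    return {
        "tensor_count": len(header),
        "dtypes": sorted({entry["dtype"] for entry in header.values()}),
        "vocab_tensors": vocab_tensors,
    }
```

If the checkpoint metadata in this card describes the same file, `summarize("t3_hi.safetensors")` should report 292 tensors with dtype `F32` (the safetensors name for float32). The shape filter may also match output projection layers that share a vocabulary dimension, not only the embeddings.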
Model tree for ResembleAI/Chatterbox-Multilingual-hi
Base model: ResembleAI/chatterbox