๐ŸŽ™๏ธ Live demo: Try this model in the ResembleAI/Chatterbox-Multilingual-TTS-zh-cmn Space.

Chatterbox Multilingual: Mandarin Chinese

Chatterbox Multilingual: Mandarin Chinese is a dedicated single-language finetune in the Chatterbox Multilingual V3 Single Language Pack. It is optimized for Mandarin Chinese and tuned for language- and region-specific behavior in expressive text-to-speech and voice cloning.

Use this model when you want tighter quality control for Mandarin Chinese than the broad multilingual checkpoint offers. For a single model that covers all supported languages, use ResembleAI/chatterbox.

Demo

Try the hosted demo Space: ResembleAI/Chatterbox-Multilingual-TTS-zh-cmn.

Files

  • t3_zh_cmn.safetensors: T3 state dict in safetensors format.
  • s3gen_v3.pt / s3gen_v3.safetensors: V3 S3Gen speech decoder checkpoint.
  • grapheme_mtl_merged_expanded_v1.json: multilingual tokenizer config.
  • ve.pt: voice encoder checkpoint used by the demo.
  • conds.pt: built-in reference voice conditioning used by the demo.
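The files listed above can be fetched individually from the Hub. The following is a minimal sketch, not the demo's actual loading code. It assumes the model repository ID is ResembleAI/Chatterbox-Multilingual-zh-cmn, that the `huggingface_hub` package is installed, and that the safetensors variant of S3Gen is the one you want. `download_assets` and its `fetch` parameter are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Iterable, Optional

# Assumed repository ID; adjust it if the repo lives under a different name.
REPO_ID = "ResembleAI/Chatterbox-Multilingual-zh-cmn"

# File names copied from the Files list (safetensors variant of S3Gen chosen here).
ASSET_FILES = (
    "t3_zh_cmn.safetensors",
    "s3gen_v3.safetensors",
    "grapheme_mtl_merged_expanded_v1.json",
    "ve.pt",
    "conds.pt",
)


def download_assets(
    repo_id: str = REPO_ID,
    filenames: Iterable[str] = ASSET_FILES,
    fetch: Optional[Callable[..., str]] = None,
) -> Dict[str, str]:
    """Return a mapping of file name -> local path for each requested asset.

    `fetch` is a hypothetical injection point: any callable accepting
    `repo_id=` and `filename=` keywords. It defaults to
    huggingface_hub.hf_hub_download, imported lazily so this module can be
    imported without the package installed.
    """
    if fetch is None:
        from huggingface_hub import hf_hub_download

        fetch = hf_hub_download
    return {name: fetch(repo_id=repo_id, filename=name) for name in filenames}
```

Calling `download_assets()` with no arguments downloads all five files into the local Hub cache and returns their paths. Passing a custom `fetch` makes it possible to dry-run the logic without network access.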

Language

  • Locale: zh-CMN
  • Chatterbox language ID: zh

Checkpoint Metadata

  • Source step: 88000
  • Source checkpoint: t3_088000.pth.tar
  • Tensor count: 292
  • Dtype: float32
  • Text embedding shape: (2454, 1024)
  • Speech embedding shape: (8194, 1024)
  • Size: 2143990176 bytes
  • SHA256: 4c51f16551e4f0a5e7bb88c58c4fc1f4ab2f0a47173665858553c8b76c900fe7
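The size and SHA256 values above can be used to confirm that a download is intact. The sketch below uses only the standard library and assumes the metadata describes the T3 checkpoint file (t3_zh_cmn.safetensors), which the model card does not state explicitly. The helper names are illustrative.

```python
import hashlib
import os

# Values copied from the Checkpoint Metadata list.
EXPECTED_SIZE = 2143990176
EXPECTED_SHA256 = "4c51f16551e4f0a5e7bb88c58c4fc1f4ab2f0a47173665858553c8b76c900fe7"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so a ~2 GB checkpoint never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(
    path: str,
    expected_size: int = EXPECTED_SIZE,
    expected_sha256: str = EXPECTED_SHA256,
) -> bool:
    """Return True only if both the byte size and the SHA256 digest match."""
    # Check the size first: it is cheap and catches truncated downloads early.
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256
```

For example, `verify_checkpoint("t3_zh_cmn.safetensors")` returns `False` for a truncated or modified file without raising.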

Loader Notes

This repository contains Chatterbox Multilingual V3 single-language assets used by the linked demo Space. The T3 checkpoint is loaded with a multilingual text vocabulary of 2454 entries and an S3 speech vocabulary of 8194 entries, matching the embedding shapes listed under Checkpoint Metadata.
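To check the vocabulary sizes without loading any weights, you can read just the safetensors header, which is an 8-byte little-endian length followed by a JSON table of tensor names, dtypes, and shapes. The sketch below is a standard-library illustration, not the loader used by the demo. Because the model card does not give the tensor key names, it searches by shape instead of by name, so other tensors with the same shape (for example an output head) may also be reported.

```python
import json
import struct
from typing import Dict, List, Sequence


def read_safetensors_header(path: str) -> Dict[str, dict]:
    """Parse the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def tensors_with_shape(header: Dict[str, dict], shape: Sequence[int]) -> List[str]:
    """Return the sorted names of all tensors whose shape equals `shape`."""
    return sorted(
        name for name, info in header.items() if list(info["shape"]) == list(shape)
    )


def check_vocab_shapes(
    path: str,
    text_shape: Sequence[int] = (2454, 1024),
    speech_shape: Sequence[int] = (8194, 1024),
) -> Dict[str, object]:
    """Summarize which tensors match the documented embedding shapes."""
    header = read_safetensors_header(path)
    return {
        "text": tensors_with_shape(header, text_shape),
        "speech": tensors_with_shape(header, speech_shape),
        "tensor_count": len(header),
    }
```

Running `check_vocab_shapes("t3_zh_cmn.safetensors")` should list at least one tensor under each of `"text"` and `"speech"` if the file matches the metadata. The `"tensor_count"` value can be compared against the documented count of 292.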

The demo combines these model-specific assets with the shared Chatterbox inference code and companion assets needed for end-to-end speech generation.
