Workflow - Talking Avatar (voice cloning with Qwen-TTS)
LTX - Talking Avatar with Qwen TTS (optionally with a steady-camera LoRA if you want a "true" talking avatar)
A workflow that connects Qwen TTS directly to LTX-2, enabling voice cloning for a consistent voice across video generations.
Prompt the dialog directly into the Qwen TTS node and supply a reference audio clip for voice cloning.
Image to video (I2V): https://huggingface.co/RuneXX/LTX-2-Workflows/blob/main/LTX-2%20-%20I2V%20Talking%20Avatar%20(voice%20clone%20Qwen-TTS).json
Text to video (T2V): https://huggingface.co/RuneXX/LTX-2-Workflows/blob/main/LTX-2%20-%20T2V%20Talking%20Avatar%20(voice%20clone%20Qwen-TTS).json
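If you'd rather drive the workflow from a script than from the ComfyUI UI, here is a minimal sketch that queues it through ComfyUI's HTTP API. It assumes ComfyUI is running locally on the default port and that the workflow was exported in API format ("Save (API Format)" in ComfyUI); the node ID "12" and the input names "text" and "reference_audio" are placeholders, so open your own export and substitute the real ones for the Qwen TTS node.

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # default local ComfyUI endpoint

# Load the workflow exported in API format ("Save (API Format)" in ComfyUI).
with open("LTX-2 - I2V Talking Avatar (voice clone Qwen-TTS).json") as f:
    workflow = json.load(f)

# Placeholder node ID and input names -- check your own export for the
# actual ID and inputs of the Qwen TTS node.
workflow["12"]["inputs"]["text"] = "Hello! This line is spoken by the cloned voice."
workflow["12"]["inputs"]["reference_audio"] = "voice_sample.wav"

# POST the patched graph to ComfyUI's queue.
req = urllib.request.Request(
    COMFY_URL,
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())  # ComfyUI responds with the queued prompt id
```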
Needed nodes (an install sketch follows the list):
- https://github.com/kijai/ComfyUI-KJNodes - update to the latest version
- https://github.com/city96/ComfyUI-GGUF - update to the latest version
- https://github.com/1038lab/ComfyUI-QwenTTS - the models are auto-downloaded on first use
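A minimal install/update sketch for those three node packs, assuming a standard ComfyUI layout (COMFYUI_DIR is an assumption; point it at your own install):

```python
import subprocess
from pathlib import Path

COMFYUI_DIR = Path.home() / "ComfyUI"  # assumption: adjust to your setup
custom_nodes = COMFYUI_DIR / "custom_nodes"

repos = [
    "https://github.com/kijai/ComfyUI-KJNodes",
    "https://github.com/city96/ComfyUI-GGUF",
    "https://github.com/1038lab/ComfyUI-QwenTTS",
]

for url in repos:
    target = custom_nodes / url.rsplit("/", 1)[-1]
    if target.exists():
        # Existing clone: pull to get the latest version.
        subprocess.run(["git", "-C", str(target), "pull"], check=True)
    else:
        # Fresh install: clone into custom_nodes.
        subprocess.run(["git", "clone", url, str(target)], check=True)
```

Restart ComfyUI after installing so the new nodes are picked up.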
Credit to @f0rkineye for the idea.
By the way, there are many Qwen TTS repos out now; they all work and are all similar or near-identical. The one I picked was somewhat arbitrary, chosen because it has a simple node view (as well as a complex one).
You can easily swap out the Qwen TTS node I used for another one if you already have Qwen TTS installed. Some alternative Qwen TTS nodes:
https://github.com/HAIGC/Comfyui-HAIGC-QwenTTS
https://github.com/DarioFT/ComfyUI-Qwen3-TTS
https://github.com/flybirdxx/ComfyUI-Qwen-TTS
Qwen-TTS has currently released a small, fast 0.6B model that is light on hardware, and a more accurate 1.7B model: https://qwen.ai/blog?id=qwen3tts-0115
(You can of course use other TTS models instead, such as Microsoft's VibeVoice, which is very good, IndexTTS, etc.)
DarioFT's node lets you set a local model path and includes nodes for training (fine-tuning), but has no seed option.
Your Image to video (I2V) workflow link seems to be broken.
Thanks for the notice ;-) Fixed the link.