EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech • Paper • 2406.07803 • Published Jun 12, 2024
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector • Paper • 2411.02625 • Published Nov 4, 2024
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer • Paper • 2307.16171 • Published Jul 30, 2023
DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech • Paper • 2505.19687 • Published May 26, 2025
JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis • Paper • 2501.04904 • Published Jan 9, 2025
Toward Complex-Valued Neural Networks for Waveform Generation • Paper • 2603.11589 • Published 14 days ago
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment • Paper • 2401.08095 • Published Jan 16, 2024
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training • Paper • 2307.16549 • Published Jul 31, 2023