# Supplemental reading and resources

This unit introduced the text-to-speech task and covered a lot of ground.
Want to learn more? Here you will find additional resources to help you deepen your understanding of the topics
and enhance your learning experience.

* [HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis](https://arxiv.org/pdf/2010.05646.pdf): a paper introducing HiFi-GAN, a GAN-based vocoder that generates waveforms from mel-spectrograms.
* [X-Vectors: Robust DNN Embeddings For Speaker Recognition](https://www.danielpovey.com/files/2018_icassp_xvectors.pdf): a paper introducing the X-Vector method for speaker embeddings.
* [FastSpeech 2: Fast and High-Quality End-to-End Text to Speech](https://arxiv.org/pdf/2006.04558.pdf): a paper introducing FastSpeech 2, another popular text-to-speech model, which uses a non-autoregressive method.
* [A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech](https://arxiv.org/pdf/2302.04215v1.pdf): a paper introducing MQTTS, an autoregressive TTS system that replaces mel-spectrograms with a quantized discrete representation.