| | --- |
| | tags: |
| | - text-to-speech |
| | license: cc-by-nc-sa-4.0 |
| | language: |
| | - zh |
| | - en |
| | - de |
| | - ja |
| | - fr |
| | - es |
| | - ko |
| | - ar |
| | - nl |
| | - ru |
| | - it |
| | - pl |
| | - pt |
| | pipeline_tag: text-to-speech |
| | inference: false |
| | extra_gated_prompt: >- |
| | You agree not to use the model to generate content that violates the DMCA or local |
| | laws. |
| | extra_gated_fields: |
| | Country: country |
| | Specific date: date_picker |
| | I agree to use this model for non-commercial use ONLY: checkbox |
| | --- |
| | |
| |
|
| | # Fish Speech V1.5 |
| |
|
| | **Fish Speech V1.5** is a leading text-to-speech (TTS) model trained on more than 1 million hours of audio data in multiple languages. |
| |
|
| | Supported languages: |
| | - English (en) >300k hours |
| | - Chinese (zh) >300k hours |
| | - Japanese (ja) >100k hours |
| | - German (de) ~20k hours |
| | - French (fr) ~20k hours |
| | - Spanish (es) ~20k hours |
| | - Korean (ko) ~20k hours |
| | - Arabic (ar) ~20k hours |
| | - Russian (ru) ~20k hours |
| | - Dutch (nl) <10k hours |
| | - Italian (it) <10k hours |
| | - Polish (pl) <10k hours |
| | - Portuguese (pt) <10k hours |
| |
|
| | Please refer to [Fish Speech GitHub](https://github.com/fishaudio/fish-speech) for more information. |
| | A demo is available at [Fish Audio](https://fish.audio/). |
| |
|
| | ## Citation |
| |
|
| | If you find this repository useful, please consider citing this work: |
| |
|
| | ``` |
| | @misc{fish-speech-v1.4, |
| | title={Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis}, |
| | author={Shijia Liao and Yuxuan Wang and Tianyu Li and Yifan Cheng and Ruoyi Zhang and Rongzhi Zhou and Yijin Xing}, |
| | year={2024}, |
| | eprint={2411.01156}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.SD}, |
| | url={https://arxiv.org/abs/2411.01156}, |
| | } |
| | ``` |
| |
|
| | ## License |
| |
|
| | This model is licensed under the CC BY-NC-SA 4.0 license, which permits non-commercial use only. |