---
title: Audiobook Generator - English to 36 Languages
emoji: 📖
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: "5.25.0"
app_file: app.py
pinned: false
license: mit
---

# 📖 Audiobook Generator - English to 36 Languages

Paste or upload English text and generate a professionally narrated audiobook in any of **36 languages**, powered by Alibaba's Qwen3.5-Omni-Plus.

## Features

- **Translation + Narration**: Translates English text and generates expressive speech in the target language
- **Direct Narration**: Generate English audiobooks without translation
- **29 narrator voices**: Male and female voices with different styles (cinematic, warm, dramatic, etc.)
- **Smart text splitting**: Handles long texts by splitting at sentence/paragraph boundaries
- **MP3 output**: Compressed for easy download and sharing
- **Section pauses**: Optional natural pauses between text sections

## Setup

1. Add your **DashScope API key** as a Space Secret:
   - Settings → Secrets → New Secret
   - Name: `DASHSCOPE_API_KEY`
   - Value: your key ([get one here](https://www.alibabacloud.com/help/en/model-studio/get-api-key))

## Supported Languages (36)

### ⭐ Core Languages (Best Quality)

English, Chinese, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian

### Extended Languages

Arabic, Bengali, Cantonese, Czech, Danish, Dutch, Filipino, Finnish, Greek, Hebrew, Hindi, Hungarian, Indonesian, Malay, Norwegian, Persian, Polish, Romanian, Swahili, Swedish, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese

## How It Works

1. Your English text is split into manageable chunks at sentence boundaries
2. Each chunk is sent to `qwen3.5-omni-plus` with instructions to translate (if needed) and narrate
3. The model generates expressive speech with audiobook-quality narration
4. All audio chunks are concatenated and converted to MP3
5. Download your audiobook!
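
The chunking in step 1 can be illustrated with a minimal sketch. This is not the code in `app.py`: the function name `split_text`, the `max_chars` default of 1500, and the fallback that hard-wraps an oversized sentence are all assumptions made for illustration.

```python
import re


def split_text(text: str, max_chars: int = 1500) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    Hypothetical helper: the name, the default limit, and the hard-wrap
    fallback are illustrative assumptions, not the app's actual code.
    """
    # Split after '.', '!' or '?' followed by whitespace.
    # Paragraph breaks are whitespace too, so they also end a sentence here.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:].lstrip()

        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            # The sentence still fits in the chunk being built.
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            chunks.append(current)
            current = sentence

    if current:
        chunks.append(current)
    return chunks
```

Calling `split_text(book_text)` returns a list of strings that can each be sent to the model in turn. Packing whole sentences keeps each request self-contained, which tends to matter for translation and for natural-sounding narration.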

## Limitations

- Processing time: ~30-60 seconds per ~1500 characters
- Extended languages may have variable voice quality compared to the core 10
- Very long texts (100k+ characters) may take significant time
- The model generates speech at its own pace, so timing won't match a human narrator exactly

## Credits

- Model: [Qwen3.5-Omni-Plus](https://qwen.ai) by Alibaba Cloud
- API: [DashScope](https://www.alibabacloud.com/help/en/model-studio/)
- UI: [Gradio](https://gradio.app)