	---
	title: Audiobook Generator - English to 36 Languages
	emoji: 📖
	colorFrom: indigo
	colorTo: purple
	sdk: gradio
	sdk_version: "5.25.0"
	app_file: app.py
	pinned: false
	license: mit
	---

	# 📖 Audiobook Generator — English to 36 Languages

	Paste or upload English text and generate a professionally narrated audiobook in any of 36 languages, powered by Alibaba's Qwen3.5-Omni-Plus.

	## Features

	- Translation + Narration: Translates English text and generates expressive speech in the target language
	- Direct Narration: Generate English audiobooks without translation
	- 29 narrator voices: Male and female voices with different styles (cinematic, warm, dramatic, etc.)
	- Smart text splitting: Handles long texts by splitting at sentence/paragraph boundaries
	- MP3 output: Compressed for easy download and sharing
	- Section pauses: Optional natural pauses between text sections

	## Setup

	1. Add your DashScope API key as a Space Secret:
	   - Settings → Secrets → New Secret
	   - Name: `DASHSCOPE_API_KEY`
	   - Value: your key ([get one here](https://www.alibabacloud.com/help/en/model-studio/get-api-key))
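
Space Secrets are exposed to the running app as environment variables. A minimal sketch of reading the key at startup is shown below; the helper name `get_dashscope_api_key` is hypothetical and is not taken from `app.py`.

```python
import os


def get_dashscope_api_key(env=os.environ):
    """Return the DashScope key from the environment, or raise a clear error.

    Hypothetical helper for illustration; `env` is injectable so the
    function can be exercised without touching the real environment.
    """
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set. Add it under Settings → Secrets."
        )
    return key
```

Failing early with an explicit message makes a missing secret easier to spot than an authentication error on the first API call.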

	## Supported Languages (36)

	### ⭐ Core Languages (Best Quality)
	English, Chinese, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian

	### Extended Languages
	Arabic, Bengali, Cantonese, Czech, Danish, Dutch, Filipino, Finnish, Greek, Hebrew, Hindi, Hungarian, Indonesian, Malay, Norwegian, Persian, Polish, Romanian, Swahili, Swedish, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese

	## How It Works

	1. Your English text is split into manageable chunks at sentence or paragraph boundaries
	2. Each chunk is sent to `qwen3.5-omni-plus` with instructions to translate (if needed) and narrate
	3. The model generates expressive speech with audiobook-quality narration
	4. All audio chunks are concatenated and converted to MP3
	5. Download your audiobook!
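
Step 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `app.py`: the function name `split_text` is hypothetical, the 1500-character limit is borrowed from the timing figure under Limitations, and sentence ends are detected with a simple regex that only handles `.`, `!`, and `?`.

```python
import re


def split_text(text, max_chars=1500):
    """Split text into chunks of at most max_chars, breaking at sentence ends.

    Hypothetical helper. A single sentence longer than max_chars is
    hard-split at max_chars as a last resort.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Break up any sentence that cannot fit in a chunk on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate  # still fits: keep growing this chunk
        else:
            chunks.append(current)  # chunk is full: start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Greedy packing keeps the number of API calls low while never cutting a sentence in half unless the sentence alone exceeds the limit.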

	## Limitations

	- Processing time: ~30-60 seconds per ~1500 characters
	- Extended languages may have variable voice quality compared to the core 10
	- Very long texts (100k+ characters) can take a long time: at the rate above, roughly half an hour to over an hour if chunks are processed one after another
	- The model generates speech at its own pace, so timing won't match a human narrator exactly
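
The processing-time figure can be turned into a rough estimate for a given text length. The sketch below assumes chunks of about 1500 characters processed sequentially at 30 to 60 seconds each; `estimate_minutes` is a hypothetical helper, and real timings depend on API load and the chosen language.

```python
import math


def estimate_minutes(num_chars, chunk_chars=1500, sec_per_chunk=(30, 60)):
    """Return a (low, high) estimate of processing time in minutes.

    Hypothetical helper: assumes sequential processing and the
    per-chunk timing range quoted in the list above.
    """
    chunks = math.ceil(num_chars / chunk_chars)
    low, high = sec_per_chunk
    return chunks * low / 60, chunks * high / 60


low, high = estimate_minutes(100_000)
print(f"{low:.1f} to {high:.1f} minutes")  # 33.5 to 67.0 minutes
```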

	## Credits

	- Model: [Qwen3.5-Omni-Plus](https://qwen.ai) by Alibaba Cloud
	- API: [DashScope](https://www.alibabacloud.com/help/en/model-studio/)
	- UI: [Gradio](https://gradio.app)