Qwen3-TTS-Flash

Text to Audio

Qwen3-TTS-Flash is Tongyi's latest offline text-to-speech foundation model. It offers 17 expressive voices with low-latency, high-stability audio synthesis, and supports multilingual and dialect output while keeping voice characteristics consistent across languages. Trained on massive datasets, it automatically adjusts vocal tone to the semantics of the text and handles complex content robustly. The model is provided as a snapshot version that is functionally equivalent to the snapshot qwen3-tts-flash-2025-11-27.

https://bailian.console.alibabacloud.com/cn-beijing?tab=model#/model-market/detail/qwen3-tts-flash?serviceSite=asia-pacific-china

Input

Value: 1 (Min: 0.1, Max: 2)
Value: 1 (Min: 0.5, Max: 2)
Value: 1 (Min: 0.5, Max: 2)
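
The three numeric controls above each show a value of 1 and an allowed range (0.1 to 2, 0.5 to 2, and 0.5 to 2). As a minimal sketch of how a client might keep values inside those ranges before submitting a request, the Python below clamps out-of-range input. The page does not name the controls, so `control_1` through `control_3`, `Range`, and `resolve_settings` are hypothetical, and treating 1 as the default is an assumption. Nothing here calls the actual service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Allowed interval for one numeric control, plus an assumed default."""
    minimum: float
    maximum: float
    default: float = 1.0  # assumption: the value shown on the page is the default

    def clamp(self, value: float) -> float:
        """Return value limited to [minimum, maximum]."""
        return max(self.minimum, min(self.maximum, value))


# Hypothetical labels: the page lists only the ranges, not the control names.
CONTROLS = {
    "control_1": Range(0.1, 2.0),
    "control_2": Range(0.5, 2.0),
    "control_3": Range(0.5, 2.0),
}


def resolve_settings(requested=None):
    """Fill in assumed defaults and clamp every requested value to its range."""
    requested = requested or {}
    unknown = set(requested) - set(CONTROLS)
    if unknown:
        raise KeyError(f"unknown controls: {sorted(unknown)}")
    return {
        name: rng.clamp(requested.get(name, rng.default))
        for name, rng in CONTROLS.items()
    }


if __name__ == "__main__":
    # A value below the first range and one above the second are both clamped.
    print(resolve_settings({"control_1": 0.05, "control_2": 3}))
```

Clamping silently is one design choice; a stricter client could raise an error on out-of-range input instead.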
