Qwen Releases Qwen3-TTS: Open-Source Voice Cloning and Generation

Qwen has released the Qwen3-TTS family of open-source text-to-speech models, publishing both the tokenizers and the models under the Apache 2.0 license. The models were trained on more than 5 million hours of speech across 10 languages. They support voice cloning from a three-second sample, description-based voice control, and real-time streaming synthesis via a dual-track LM, and Qwen reports state-of-the-art results on multilingual and long-form speech benchmarks. Hugging Face hosts the 0.6B (2.52 GB) and 1.7B (4.54 GB) variants, along with a browser demo that supports voice cloning.
Scoring Rationale
Strong novelty and open-source release with large training data and demos; limited by incremental advances over prior voice-cloning systems.
Sources
- Qwen3-TTS Family is Now Open Sourced: Voice Design, Clone, and Generation (simonwillison.net)



