
Starting with v3.74-0720, the pyVideoTrans video translation software integrates Alibaba's Qwen-TTS speech synthesis service.

In simple terms, Qwen-TTS is an advanced speech synthesis technology that converts text into highly realistic and natural-sounding human voices. A key highlight is its ability to automatically adjust speech rhythm and emotion based on the text content.

Qwen-TTS Model

The qwen-tts model supports both Chinese and English, as well as three dialects: Beijing dialect, Shanghai dialect (Wu Chinese), and Sichuan dialect. Model names: qwen-tts, qwen-tts-latest

Click here to view detailed voice samples and supported language descriptions for qwen-tts

Currently, Qwen-TTS supports the following voice options, all of which support both Chinese and English:

  • Beijing dialect: Dylan
  • Shanghai dialect: Jada
  • Sichuan dialect: Sunny
  • Others: Chelsie, Cherry, Ethan, Serena

Qwen3-TTS Model

The qwen3-tts model supports 10 languages and multiple Chinese dialects. Model name: qwen3-tts-flash

Click here to view detailed voice samples and supported language descriptions for qwen3-tts

Supported voices:

  • Cherry
  • Ethan
  • Nofish
  • Jennifer
  • Ryan
  • Katerina
  • Elias
  • Jada (Shanghai)
  • Dylan (Beijing)
  • Sunny (Sichuan)
  • Li (Nanjing)
  • Marcus (Shaanxi)
  • Roy (Southern Min)
  • Peter (Tianjin)
  • Rocky (Cantonese)
  • Kiki (Cantonese)
  • Eric (Sichuan)
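
For scripting around this list, the voices can be kept in a small lookup table. The following is only a sketch: the voice names and dialect labels are copied from the list above, while the names `QWEN3_TTS_VOICES` and `voices_for_dialect` are hypothetical and are not part of pyVideoTrans or of Alibaba's SDK.

```python
# Voice names and dialect labels copied from the qwen3-tts list above.
# None marks a voice that is listed without a dialect label.
QWEN3_TTS_VOICES = {
    "Cherry": None,
    "Ethan": None,
    "Nofish": None,
    "Jennifer": None,
    "Ryan": None,
    "Katerina": None,
    "Elias": None,
    "Jada": "Shanghai",
    "Dylan": "Beijing",
    "Sunny": "Sichuan",
    "Li": "Nanjing",
    "Marcus": "Shaanxi",
    "Roy": "Southern Min",
    "Peter": "Tianjin",
    "Rocky": "Cantonese",
    "Kiki": "Cantonese",
    "Eric": "Sichuan",
}


def voices_for_dialect(dialect):
    """Return the voices labeled with the given dialect, in list order."""
    return [name for name, label in QWEN3_TTS_VOICES.items() if label == dialect]
```

For example, `voices_for_dialect("Cantonese")` returns the two Cantonese voices, Rocky and Kiki.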

About the free quota:

Alibaba provides a free quota of 1 million tokens for this service, enough to synthesize approximately 20,000 seconds of audio (about 333 minutes, or roughly 5.5 hours).

This quota is quite sufficient for most individual users' regular usage and feature testing.
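
The quota figures above imply an average consumption rate, which can be worked out with a few lines of arithmetic. This is a rough sketch based only on the two numbers quoted above; actual token usage depends on the text being synthesized, and `estimated_tokens` is a hypothetical helper.

```python
FREE_TOKENS = 1_000_000   # free quota stated above
AUDIO_SECONDS = 20_000    # approximate audio length stated above

# Implied average rate: tokens consumed per second of synthesized audio.
tokens_per_second = FREE_TOKENS / AUDIO_SECONDS
minutes = AUDIO_SECONDS / 60
hours = minutes / 60


def estimated_tokens(seconds):
    """Rough token cost for `seconds` of audio at the implied average rate."""
    return seconds * tokens_per_second


print(f"about {tokens_per_second:.0f} tokens per second of audio")
print(f"about {minutes:.0f} minutes in total")
```

At this implied rate of about 50 tokens per second, a 10-minute video would use roughly 30,000 tokens, or about 3% of the free quota.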


How to Use the Qwen-TTS Feature?

No complicated setup is needed. With just a few simple steps, you can use the powerful Qwen-TTS in pyVideoTrans.

Step 1: Obtain and Configure Your API KEY

Alibaba provides a free quota for each user.

  1. Visit the Alibaba Cloud Bailian platform: https://bailian.console.aliyun.com/?tab=model#/api-key

  2. Log in to your Alibaba Cloud account (if you don't have one, follow the prompts to register).

  3. On the API-KEY management page, click "Create API-KEY". The system will automatically generate a string starting with "sk-"; this is your API KEY. Please copy this string.

  4. Return to the pyVideoTrans software, find TTS Settings in the top menu bar, click it, and select Qwen TTS from the dropdown menu.

  5. In the pop-up Qwen TTS configuration window, paste the API KEY you just copied into the "API KEY" input box. You can click the "Test" button to preview the effect. If you can hear the sound, the configuration is successful. Finally, click Save.
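
If the "Test" button produces no sound, a common cause is a key that was copied incompletely. The sketch below checks only the "sk-" prefix mentioned in the steps above; `looks_like_api_key` is a hypothetical helper, not a pyVideoTrans or Alibaba function, and passing this check does not prove the key is valid.

```python
def looks_like_api_key(value):
    """Check only the 'sk-' prefix described in the steps above."""
    key = value.strip()  # drop whitespace picked up while copying
    # The key must start with "sk-" and have something after the prefix.
    return key.startswith("sk-") and len(key) > len("sk-")
```

A string such as `"  sk-example123\n"` passes the check, while `"example123"` or a bare `"sk-"` does not.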

Step 2: Use Qwen-TTS in Video Translation

After configuration, you can enable Qwen-TTS when processing a single video.

  • On the main interface of pyVideoTrans, find the "Dubbing Channel" dropdown menu, click it, and select "Qwen TTS".
  • In the adjacent "Voice Role" menu, choose your preferred voice, for example "Cherry" for a standard female voice or "Sunny" for Sichuan dialect dubbing.

Step 3: Use in Batch Dubbing and Multi-Role Dubbing

Qwen-TTS is also available for batch processing tasks, so you can dub many files in one run.

  • Batch Dubbing for Subtitles: If you have multiple SRT subtitle files that need dubbing, switch to the "Batch Dubbing for Subtitles" interface. Similarly, select "Qwen TTS" and your desired voice role in the "Dubbing Channel" section below.

  • Multi-Role Dubbing for Subtitles: This feature is also applicable when processing dialogues involving multiple characters. You can assign different Qwen-TTS voices to different characters in the "Multi-Role Dubbing for Subtitles" functional area.
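
The idea behind multi-role dubbing can be illustrated as a mapping from characters to voices. In pyVideoTrans this assignment is done in the interface; the sketch below is only an illustration, with hypothetical speaker names and helper names, and voice names taken from the Qwen-TTS list.

```python
# Hypothetical speaker names mapped to voices from the Qwen-TTS list.
ROLE_VOICES = {
    "Narrator": "Cherry",
    "Host": "Ethan",
    "Guest": "Sunny",
}
DEFAULT_VOICE = "Serena"  # fallback for speakers without an assigned voice


def assign_voices(lines):
    """Map (speaker, text) pairs to (voice, text) pairs."""
    return [(ROLE_VOICES.get(speaker, DEFAULT_VOICE), text) for speaker, text in lines]


dialogue = [
    ("Host", "Welcome to the show."),
    ("Guest", "Thanks for having me."),
    ("Caller", "Can you hear me?"),
]
for voice, text in assign_voices(dialogue):
    print(f"{voice}: {text}")
```

Any speaker missing from the mapping, such as "Caller" here, falls back to the default voice, so every subtitle line still gets dubbed.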