ByteDance Volcano Speech Synthesis
Speech synthesis converts text into spoken audio. Several excellent open-source options exist, such as GPT-SoVITS and ChatTTS, along with the free edge-tts. There are also commercial-grade services, such as ByteDance Volcano Engine's speech synthesis service. Free options are the natural starting point, but commercial services are better suited when higher quality is needed. With the development of large models their prices keep falling, so a commercial-grade API is also a good option for voice-over.
Version 2.88 and later support ByteDance Volcano Engine's speech synthesis service, with voice-over in 8 languages: Chinese, English, Japanese, Portuguese, Spanish, Thai, Vietnamese, and Indonesian. For Chinese, it also supports various dialects such as Northeastern Mandarin and Sichuanese. The free quota is 20,000 requests, which can synthesize approximately 10 hours of speech.
Supported Chinese Voices
Only a selection of Chinese voices is listed here. For voices in the other 7 languages, see https://www.volcengine.com/docs/6561/97465
Many Chinese voices are supported, including various dialects and popular Douyin movie commentary voices such as Xiaoshuai and Xiaomei.
Voice Name | voice_type |
---|---|
Can Can 2.0 | BV700_V2_streaming |
Yang Yang | BV705_streaming |
Sunny Youth | BV123_streaming |
Anti-involution Youth | BV120_streaming |
Common Son-in-law | BV119_streaming |
Ancient Style Young Lady | BV115_streaming |
Overbearing Mature Uncle | BV107_streaming |
Simple Youth | BV100_streaming |
Gentle Lady | BV104_streaming |
Cheerful Youth | BV004_streaming |
Sweet Pampered Young Lady | BV113_streaming |
Refined Youth | BV102_streaming |
Sweet Xiao Yuan | BV405_streaming |
Kind Female Voice | BV007_streaming |
Intellectual Female Voice | BV009_streaming |
Cheng Cheng | BV419_streaming |
Tong Tong | BV415_streaming |
Kind Male Voice | BV008_streaming |
Dubbing Male Voice | BV408_streaming |
Lazy Little Sheep | BV426_streaming |
Fresh Literary Female Voice | BV428_streaming |
Chicken Soup Female Voice | BV403_streaming |
Wise Elder | BV158_streaming |
Loving Grandma | BV157_streaming |
Rapping Bro | BR001_streaming |
Energetic Commentary Male | BV410_streaming |
Movie Commentary Xiaoshuai | BV411_streaming |
Commentary Xiaoshuai - Multi-Emotion | BV437_streaming |
Movie Commentary Xiaomei | BV412_streaming |
Dandy Youth | BV159_streaming |
Top Livestream Hostess | BV418_streaming |
Calm Commentary Male | BV142_streaming |
Elegant Youth | BV143_streaming |
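The table above maps each display name to the voice_type identifier that the service expects. As a minimal sketch, a few rows can be held in a lookup so that a name is resolved to its voice_type in one place. The `VOICES` dictionary and the `voice_type_for` helper are hypothetical names used only for illustration; they are not part of the video translation software or of the Volcano Engine SDK.

```python
# A few rows copied from the table above; extend with further rows as needed.
VOICES = {
    "Can Can 2.0": "BV700_V2_streaming",
    "Yang Yang": "BV705_streaming",
    "Rapping Bro": "BR001_streaming",
    "Movie Commentary Xiaoshuai": "BV411_streaming",
    "Movie Commentary Xiaomei": "BV412_streaming",
}


def voice_type_for(name: str) -> str:
    """Return the voice_type for a display name (hypothetical helper).

    Matching ignores letter case and surrounding whitespace.
    """
    wanted = name.strip().lower()
    for display_name, voice_type in VOICES.items():
        if display_name.lower() == wanted:
            return voice_type
    raise KeyError(f"unknown voice name: {name!r}")
```

For example, `voice_type_for("Movie Commentary Xiaomei")` resolves to the identifier in the same table row.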
How to Activate
First, register, log in, and complete real-name authentication at:
https://console.volcengine.com/
After entering the console, open the Speech Technology page as shown in the figure below.
You can also go there directly: https://console.volcengine.com/speech/app
Then, create an application as shown in the figure below. Fill in the name and introduction as you like, but the key is to select "Speech Synthesis Service" and then confirm.
Next, open the speech synthesis page to activate the free trial:
https://console.volcengine.com/speech/service/8
Select the application you just created at the top of the page and click "Try" to activate.
Copy the following 3 parameters so you can fill them in the video translation software.
The first is cluster id. Copy the name shown under the cluster id, as shown in the figure.
The second is App id. Scroll down on this page to see it.
The third is Access Token, which is to the right of the App id. Copy it.
Then fill them in the video translation software: open the Menu - TTS Settings - ByteDance Volcano Speech Synthesis window, enter the values, test, and save.
Using it in Video Translation Software
Once the parameters are filled in and the test passes, select the target language in the software, then select ByteDance Volcano Speech Synthesis as the voice-over channel. You can click to preview each voice.
Select a satisfactory voice to start the voice-over operation.
Special Note
If you have activated the official version, only the General Male and General Female roles are available by default. Other roles need to be purchased and activated separately in the ByteDance Volcano backend.