ByteDance Volcano Speech Synthesis

Speech synthesis converts text into spoken audio. Several strong open-source options exist, such as GPT-SoVITS and ChatTTS, alongside the free edge-tts, and there are commercial-grade services such as ByteDance Volcano Engine's speech synthesis. Free options are a natural starting point, but commercial services generally deliver higher quality, and their prices keep falling as large models develop, so a commercial API is also a reasonable choice for voice-over.

Version 2.88 and later support ByteDance Volcano Engine's speech synthesis service, with voice-over in 8 languages: Chinese, English, Japanese, Portuguese, Spanish, Thai, Vietnamese, and Indonesian. For Chinese, it also supports dialects such as Northeastern Mandarin and Sichuanese. The free trial includes 20,000 requests, enough to synthesize approximately 10 hours of speech.

Supported Chinese Voices

Only some of the Chinese voices are listed here. For the voices in the other 7 languages, see https://www.volcengine.com/docs/6561/97465

Many Chinese voices are supported, including dialect voices and the movie commentary voices popular on Douyin, such as Xiaoshuai and Xiaomei.

Voice Name                            voice_type
Can Can 2.0                           BV700_V2_streaming
Yang Yang                             BV705_streaming
Sunny Youth                           BV123_streaming
Anti-involution Youth                 BV120_streaming
Common Son-in-law                     BV119_streaming
Ancient Style Young Lady              BV115_streaming
Overbearing Mature Uncle              BV107_streaming
Simple Youth                          BV100_streaming
Gentle Lady                           BV104_streaming
Cheerful Youth                        BV004_streaming
Sweet Pampered Young Lady             BV113_streaming
Refined Youth                         BV102_streaming
Sweet Xiao Yuan                       BV405_streaming
Kind Female Voice                     BV007_streaming
Intellectual Female Voice             BV009_streaming
Cheng Cheng                           BV419_streaming
Tong Tong                             BV415_streaming
Kind Male Voice                       BV008_streaming
Dubbing Male Voice                    BV408_streaming
Lazy Little Sheep                     BV426_streaming
Fresh Literary Female Voice           BV428_streaming
Chicken Soup Female Voice             BV403_streaming
Wise Elder                            BV158_streaming
Loving Grandma                        BV157_streaming
Rapping Bro                           BR001_streaming
Energetic Commentary Male             BV410_streaming
Movie Commentary Xiaoshuai            BV411_streaming
Commentary Xiaoshuai - Multi-Emotion  BV437_streaming
Movie Commentary Xiaomei              BV412_streaming
Dandy Youth                           BV159_streaming
Top Livestream Hostess                BV418_streaming
Calm Commentary Male                  BV142_streaming
Elegant Youth                         BV143_streaming
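
If you need to refer to these voices from a script, a plain name-to-ID mapping is usually enough. The sketch below copies a few rows from the voice list above; the dictionary name `CHINESE_VOICES` and the helper `voice_type_for` are hypothetical and are not part of the video translation software.

```python
# A few rows copied from the voice list above: display name -> voice_type.
CHINESE_VOICES = {
    "Can Can 2.0": "BV700_V2_streaming",
    "Yang Yang": "BV705_streaming",
    "Rapping Bro": "BR001_streaming",
    "Movie Commentary Xiaoshuai": "BV411_streaming",
    "Movie Commentary Xiaomei": "BV412_streaming",
}


def voice_type_for(name: str) -> str:
    """Return the voice_type for a display name (hypothetical helper)."""
    try:
        return CHINESE_VOICES[name]
    except KeyError:
        # Fail loudly instead of sending an unknown voice to the service.
        raise ValueError(f"unknown voice: {name!r}") from None
```

For example, `voice_type_for("Movie Commentary Xiaomei")` looks up the ID listed for that voice, and an unlisted name raises `ValueError`.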

How to Activate

  1. First, register, log in, and complete real-name authentication in the console:

    https://console.volcengine.com/

  2. After entering the console, open the Speech Technology page as shown in the figure below.

    image.png

    You can also go there directly: https://console.volcengine.com/speech/app

    Then create an application as shown in the figure below. The name and description can be anything, but be sure to select "Speech Synthesis Service" before confirming.

    image.png

  3. Next, open the speech synthesis page to activate the free trial.

    Go to https://console.volcengine.com/speech/service/8

    Select the application you just created from the selector at the top, then click "Try" to activate it.

    image.png

  4. Copy the following 3 parameters; you will enter them in the video translation software.

    The first is the cluster ID. Copy the value shown under "cluster id" in the figure.

    image.png

    The second is the App ID. Scroll down on the same page to find it.

    image.png

    The third is the Access Token, shown to the right of the App ID. Copy it.

    image.png

  5. Enter the parameters in the video translation software. Open Menu - TTS Settings - ByteDance Volcano Speech Synthesis, fill in the three values, run the test, and save once it passes.

    image.png
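
For context on what the three values from step 4 are used for, here is a minimal sketch of how a client might assemble a request to the speech synthesis HTTP API. The endpoint URL, the `Bearer;` header format, the JSON field names, and the helper `build_tts_request` are assumptions based on the public API layout, not taken from the video translation software; check them against the official documentation before relying on them. The sketch only builds the request and sends nothing over the network.

```python
import json
import uuid

# Assumed endpoint of the HTTP speech synthesis API; verify in the official docs.
TTS_URL = "https://openspeech.bytedance.com/api/v1/tts"


def build_tts_request(appid, access_token, cluster, text, voice_type):
    """Return (headers, body) for one synthesis request (hypothetical helper)."""
    headers = {
        # Assumed scheme: the literal "Bearer;" followed by the Access Token.
        "Authorization": f"Bearer;{access_token}",
        "Content-Type": "application/json",
    }
    body = {
        # The three values copied in step 4.
        "app": {"appid": appid, "token": access_token, "cluster": cluster},
        "user": {"uid": "demo-user"},  # placeholder user identifier
        "audio": {"voice_type": voice_type, "encoding": "mp3"},
        "request": {
            "reqid": str(uuid.uuid4()),  # unique ID for each request
            "text": text,
            "text_type": "plain",
            "operation": "query",
        },
    }
    return headers, body


# Placeholder credentials; replace them with the values from your console.
headers, body = build_tts_request(
    "your-app-id", "your-access-token", "your-cluster-id",
    "你好", "BV700_V2_streaming",
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping request construction separate from sending makes it easy to inspect the payload and confirm that the cluster ID, App ID, and Access Token end up where you expect before any quota is consumed.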

Using it in Video Translation Software

Once the settings are filled in and the test passes, select the target language in the software, then select ByteDance Volcano Speech Synthesis as the voice-over channel. You can click to preview each voice.

image.png

Pick a voice you are satisfied with, then start the voice-over.

Special Note

If you have activated the official version, only the General Male and General Female roles are available by default. Other roles must be purchased and activated separately in the ByteDance Volcano console.
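
If you script around this limitation, one option is to fall back to a default role whenever the requested voice has not been purchased. The sketch below is illustrative only: the two `voice_type` values for the General Female and General Male roles are assumptions that do not appear in the list above, and `usable_voice` is a hypothetical helper, so confirm the IDs in the official voice list first.

```python
# Assumed IDs for the two default roles; confirm them in the official voice list.
GENERAL_FEMALE = "BV001_streaming"
GENERAL_MALE = "BV002_streaming"


def usable_voice(requested, purchased, default=GENERAL_FEMALE):
    """Return `requested` if it is available, otherwise `default` (hypothetical helper)."""
    # The default roles plus anything bought separately in the console.
    allowed = {GENERAL_FEMALE, GENERAL_MALE} | set(purchased)
    return requested if requested in allowed else default
```

With nothing purchased, a request for `BV411_streaming` falls back to the default role; once that voice is added to `purchased`, it is returned unchanged.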