Integrate Minimaxi Text-to-Speech into Custom TTS API Interface

Now that edge-tts has become less reliable, producing voiceovers takes more effort. The free alternatives, such as GPT-SoVITS, CosyVoice, F5-TTS, Kokoro, and ChatTTS, all require local deployment.

Among online services, OpenAI TTS struggles with Chinese pronunciation and sounds noticeably slurred, while Azure TTS, ByteDance TTS, and 302.AI currently offer the best Chinese voice synthesis.

Since the v3.62 update, the Custom TTS API Interface has built-in support for text-to-speech from Minimaxi (the company behind Hailuo AI). It offers dozens of voices and more than a dozen languages, with adjustable emotion, pitch, and more, making it a viable choice.

Introduction to the Integration Methods

There are two integration methods. The first goes through 302.AI and is the simpler one: registration is instant, no real-name authentication is required, and there are fewer restrictions, so it is the recommended option. The second is native integration with Minimaxi.com, which is slightly more complex, has a low request rate limit (3 requests per minute), and requires real-name authentication with bank card and phone number verification.

One: Integrate through 302.AI

This method still uses Minimaxi's voices; the requests are simply routed through 302.AI for convenience. 302.AI Registration Link (register through this link to receive a $1 credit): https://share.302.ai/pyvideo

First, upgrade pyVideoTrans to v3.62 (Upgrade Link: https://pvt9.com/downpackage).
Then go to Menu -- TTS Settings -- Custom TTS API. As shown in the image, enter https://api.302.ai/minimaxi/v1/t2a_v2 in the API field and paste the following role list into the Voiceover Role Name field. The role list is the same for both integration methods.

青涩青年音色:male-qn-qingse,
精英青年音色:male-qn-jingying,
霸道青年音色:male-qn-badao,
青年大学生音色:male-qn-daxuesheng,
少女音色:female-shaonv,
御姐音色:female-yujie,
成熟女性音色:female-chengshu,
甜美女性音色:female-tianmei,
男性主持人:presenter_male,
女性主持人:presenter_female,
男性有声书1:audiobook_male_1,
男性有声书2:audiobook_male_2,
女性有声书1:audiobook_female_1,
女性有声书2:audiobook_female_2,
青涩青年音色-beta:male-qn-qingse-jingpin,
精英青年音色-beta:male-qn-jingying-jingpin,
霸道青年音色-beta:male-qn-badao-jingpin,
青年大学生音色-beta:male-qn-daxuesheng-jingpin,
少女音色-beta:female-shaonv-jingpin,
御姐音色-beta:female-yujie-jingpin,
成熟女性音色-beta:female-chengshu-jingpin,
甜美女性音色-beta:female-tianmei-jingpin,
聪明男童:clever_boy,
可爱男童:cute_boy,
萌萌女童:lovely_girl,
卡通猪小琪:cartoon_pig,
病娇弟弟:bingjiao_didi,
俊朗男友:junlang_nanyou,
纯真学弟:chunzhen_xuedi,
冷淡学长:lengdan_xiongzhang,
霸道少爷:badao_shaoye,
甜心小玲:tianxin_xiaoling,
俏皮萌妹:qiaopi_mengmei,
妩媚御姐:wumei_yujie,
嗲嗲学妹:diadia_xuemei,
淡雅学姐:danya_xuejie,
Santa Claus:Santa_Claus,
Grinch:Grinch,
Rudolph:Rudolph,
Arnold:Arnold,
Charming Santa:Charming_Santa,
Charming Lady:Charming_Lady,
Sweet Girl:Sweet_Girl,
Cute Elf:Cute_Elf,
Attractive Girl:Attractive_Girl,
Serene Woman:Serene_Woman
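To sanity-check a role list before pasting it, the "display name:voice ID," format above can be parsed with a few lines of Python. This is only a sketch: the function name parse_roles is made up for illustration, and it is an assumption that the software splits entries on commas and colons in this way.

```python
def parse_roles(raw: str) -> dict[str, str]:
    """Parse 'display name:voice_id,' entries into {display_name: voice_id}."""
    roles = {}
    for entry in raw.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip the empty piece left by a trailing comma
        # Split on the last colon so the voice ID is always the final part.
        name, _, voice_id = entry.rpartition(":")
        if not name.strip() or not voice_id.strip():
            raise ValueError(f"Malformed role entry: {entry!r}")
        roles[name.strip()] = voice_id.strip()
    return roles


# A short excerpt of the list above, used as sample input.
sample = """少女音色:female-shaonv,
御姐音色:female-yujie,
Santa Claus:Santa_Claus"""

roles = parse_roles(sample)
print(roles["Santa Claus"])  # Santa_Claus
```

An entry without a colon raises ValueError, which makes a stray line break or a missing voice ID easy to spot.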

Copy your API key from the 302.AI dashboard and paste it into the SK field in the software.

The final configuration should look like the image below. Test it, and if the audio plays correctly, the configuration is correct. Save it to start using it.

Two: Native Minimaxi Integration

Registration and Login Address: https://platform.minimaxi.com/login

After logging in, you will need to complete real-name verification with your bank card number and the phone number registered with your bank. Once verification succeeds, open https://platform.minimaxi.com/user-center/basic-information and copy the groupID.

Then, open Menu -- TTS Settings -- Custom TTS API in the software. In the API address field, enter the following, replacing YOUR_GROUP_ID with the groupID you copied: https://api.minimaxi.chat/v1/t2a_v2?GroupId=YOUR_GROUP_ID
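If you prefer to generate the API address rather than edit it by hand, a minimal Python sketch follows. The endpoint and the GroupId parameter come from the address above; the function name build_api_url and the sample groupID are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the address given in the text.
BASE_URL = "https://api.minimaxi.chat/v1/t2a_v2"


def build_api_url(group_id: str) -> str:
    """Return the API address with the copied groupID appended as GroupId."""
    group_id = group_id.strip()  # drop whitespace picked up when copying
    if not group_id:
        raise ValueError("groupID must not be empty")
    return f"{BASE_URL}?{urlencode({'GroupId': group_id})}"


# "1234567890" is a placeholder, not a real groupID.
print(build_api_url(" 1234567890 "))
# https://api.minimaxi.chat/v1/t2a_v2?GroupId=1234567890
```

Stripping whitespace guards against the most common copy-and-paste mistake, a trailing space or newline after the groupID.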

Enter the API key in the SK field. You can create it at this address: https://platform.minimaxi.com/user-center/basic-information/interface-key

The voiceover role configuration is the same as with 302.AI. The completed configuration should look like the image below:

Note that the test may fail if you have not passed real-name authentication. When using this method, also open Menu -- Tools/Options -- Advanced Options -- Voiceover Adjustments, set the Number of Simultaneous Voiceovers to 1, and set the Pause Time After Voiceover to a value greater than 25. Otherwise, you are likely to exceed the rate limit: regular users are allowed only 3 requests per minute, i.e., one request every 20 seconds.
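The arithmetic behind these settings can be checked with a short Python sketch. It assumes the pause value is in seconds and that each subtitle line costs one request; both function names are made up for illustration, and the estimate ignores the time the synthesis itself takes.

```python
def min_interval_seconds(requests_per_minute: int) -> float:
    """Smallest spacing between requests that stays within the limit."""
    if requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be positive")
    return 60 / requests_per_minute


def estimated_minutes(num_subtitles: int, pause_seconds: float) -> float:
    """Rough lower bound on total dubbing time: one pause per subtitle line."""
    return num_subtitles * pause_seconds / 60


print(min_interval_seconds(3))     # 20.0 seconds between requests
print(estimated_minutes(300, 25))  # 125.0 minutes for 300 subtitle lines
```

A pause of 25 leaves a 5-second margin over the 20-second minimum, and the second figure shows why this method is slow for long subtitle files.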

Pronunciation Language Selection

The following languages are supported: Chinese, Cantonese, English, Spanish, French, Russian, German, Portuguese, Arabic, Italian, Japanese, Korean, Indonesian, Vietnamese, Turkish, Dutch, Ukrainian.

When dubbing from the software interface, select the language of the subtitles; it must be one of the languages listed above. Only if you need Cantonese pronunciation do you need to open the Custom TTS API interface and set the language to Chinese,Yue. In all other cases, make sure auto is selected there.
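The selection rule can be written as a small Python sketch. The language names are taken from the list above and the two values Chinese,Yue and auto from the text; the function name language_setting is hypothetical and is not part of the software.

```python
# Languages as listed in the text.
SUPPORTED_LANGUAGES = {
    "Chinese", "Cantonese", "English", "Spanish", "French", "Russian",
    "German", "Portuguese", "Arabic", "Italian", "Japanese", "Korean",
    "Indonesian", "Vietnamese", "Turkish", "Dutch", "Ukrainian",
}


def language_setting(subtitle_language: str) -> str:
    """Return the value to select in the Custom TTS API interface."""
    if subtitle_language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {subtitle_language!r}")
    # Only Cantonese needs an explicit setting; everything else uses auto.
    return "Chinese,Yue" if subtitle_language == "Cantonese" else "auto"


print(language_setting("Cantonese"))  # Chinese,Yue
print(language_setting("English"))    # auto
```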

Pronunciation Emotion Selection

Minimaxi supports 7 emotions: Happy, Sad, Angry, Fear, Disgust, Surprise, Neutral. In testing, however, the differences between them were small. If needed, you can set the emotion in this interface.
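If you script your settings, a typo in the emotion name is easy to catch up front. The Python sketch below checks a value against the seven names listed above; the function name normalize_emotion is hypothetical, falling back to Neutral for an empty value is an assumption, and the exact spelling the interface or the API expects may differ.

```python
# Emotion names as listed in the text.
EMOTIONS = ("Happy", "Sad", "Angry", "Fear", "Disgust", "Surprise", "Neutral")


def normalize_emotion(value: str = "") -> str:
    """Match an emotion name case-insensitively against the supported list."""
    wanted = value.strip().lower()
    if not wanted:
        return "Neutral"  # assumed default when nothing is specified
    for emotion in EMOTIONS:
        if emotion.lower() == wanted:
            return emotion
    raise ValueError(f"Unsupported emotion: {value!r}")


print(normalize_emotion("happy"))  # Happy
print(normalize_emotion())         # Neutral
```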

Finally, unless you have a high-tier enterprise account with Minimaxi, the 302.AI integration method is recommended. Otherwise, at 3 requests per minute, subtitle voiceover will either be unbearably slow or frequently fail with a rate limit error. 302.AI Registration Link (with $1 trial credit): https://share.302.ai/pyvideo