I used to rely on edge-tts for voiceovers, and it worked almost flawlessly. Since the end of last year, however, it has frequently returned 403 errors. At first this happened only in China, and switching to a foreign IP was a temporary workaround, but the error now occurs globally. It seems that even a large company like Microsoft can't withstand everyone's excessive "free riding."
If you still want to use edge-tts, use it sparingly, and in particular avoid sending frequent requests from the same IP. Otherwise, the Microsoft server will return a 403 error, which the software displays as a "rate limit error" for clarity. Here are two solutions:
- Try deploying the interface to Cloudflare to leverage its dynamic characteristics, which can reduce the occurrence of 403 errors. For specific instructions, refer to this document: https://pvt9.com/edgettscf
- Alternatively, continue using it locally but with a dynamic proxy, which means changing the IP for each request. For specific instructions, check out this article: https://pvt9.com/edgetts-proxy
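As a rough illustration of the second option, here is a minimal sketch of rotating proxies per request with the edge-tts Python package. The proxy addresses, the voice name, and the 2-second pause are all placeholder assumptions, not values from the linked article; substitute whatever your own proxy provider gives you.

```python
import asyncio
import itertools

# Hypothetical proxy pool (assumption): replace with addresses from your own provider.
PROXIES = [
    "http://127.0.0.1:10801",
    "http://127.0.0.1:10802",
    "http://127.0.0.1:10803",
]

_proxy_cycle = itertools.cycle(PROXIES)


def next_proxy() -> str:
    """Return the next proxy in round-robin order so consecutive requests leave from different IPs."""
    return next(_proxy_cycle)


async def synthesize(
    text: str,
    out_path: str,
    voice: str = "zh-CN-XiaoxiaoNeural",  # example voice, chosen only for illustration
    pause: float = 2.0,  # arbitrary delay to keep the request rate low
) -> None:
    """Synthesize one clip through the next proxy, then pause before returning."""
    import edge_tts  # imported lazily; requires `pip install edge-tts`

    communicate = edge_tts.Communicate(text, voice, proxy=next_proxy())
    await communicate.save(out_path)
    await asyncio.sleep(pause)
```

To try it, run something like asyncio.run(synthesize("Hello", "out.mp3")). Rotating proxies only lowers the chance of a 403; it does not guarantee the request will succeed.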
Using Local Voiceover Models
Besides edge-tts, you can also use some open-source local voiceover models, such as GPT-SoVITS, ChatTTS-ui, Fish-TTS, F5-TTS, CosyVoice, Clone-voice, KokoroTTS, etc. These are all free and can be used after deploying them on your own computer. However, this requires some extra time for configuration and a certain level of computer hardware and hands-on skills.
If you want to try it out, you can refer to this tutorial: https://pvt9.com/gptsovits. The left sidebar of the page also provides more information.
Using Online Voiceover APIs Instead
If your hardware isn't good enough, or you don't want to bother with local deployment, you can choose online voiceover APIs, such as OpenAI TTS, Azure TTS, ByteDance Volcano Speech Synthesis, etc.
However, using OpenAI TTS or Azure TTS directly in China requires a VPN, and the free quota is very limited. Paying for it also requires a foreign phone number and credit card, which is quite troublesome. It is recommended to use a domestic OpenAI TTS relay service or Azure TTS relay service, which would be much more convenient.
If you use the official OpenAI TTS, you only need to open Menu--TTS Settings--OpenAI TTS API in the software and fill in your SK in the SK text box. No further settings are needed. But don't forget that a VPN is required in China to use it.
The following steps explain how to use a third-party OpenAI TTS relay, Azure TTS voiceover, and ByteDance speech synthesis.
Using 302.AI or other third-party OpenAI TTS relay APIs
Registration and login address (gives $1 credit): https://share.302.ai/pyvideo
The steps are very simple:
- In the software's Menu--TTS Settings--OpenAI TTS API, fill in the API URL with https://api.302.ai/v1. If you are using another company's relay API, fill in the address they provide, usually ending with /v1.
- In the SK text box, fill in the API Key you created on 302.AI. If it's another third-party service, fill in the Key they provide.
Test it. If the voiceover audio can play automatically, it means the settings are successful. After that, you can select OpenAI TTS in the voiceover channel on the main interface of the software. Supported voices include: alloy, ash, coral, echo, fable, onyx, nova, sage, shimmer.
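If you want to check a relay outside the software, the sketch below shows roughly what a request to an OpenAI-compatible TTS endpoint looks like. It assumes the relay mirrors the official OpenAI path (/audio/speech under the /v1 base URL) and accepts the tts-1 model name; both are assumptions about the relay, and build_speech_request and synthesize are hypothetical helper names, not part of the software.

```python
import json
import urllib.request

# The nine voices listed above.
SUPPORTED_VOICES = {
    "alloy", "ash", "coral", "echo", "fable", "onyx", "nova", "sage", "shimmer",
}


def build_speech_request(
    api_url: str,
    api_key: str,
    text: str,
    voice: str = "alloy",
    model: str = "tts-1",  # assumed model name; check what your relay supports
) -> urllib.request.Request:
    """Build (but do not send) a POST request for an OpenAI-style speech endpoint."""
    if voice not in SUPPORTED_VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    # Assumed path: the API URL ending in /v1, followed by /audio/speech.
    endpoint = api_url.rstrip("/") + "/audio/speech"
    payload = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_url: str, api_key: str, text: str, out_path: str, voice: str = "alloy") -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(api_url, api_key, text, voice)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

For example, synthesize("https://api.302.ai/v1", "your-key", "Hello", "test.mp3", voice="nova") should produce an audio file if the key and URL are valid. An HTTP error here usually points to a wrong URL or Key rather than a problem in the software.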
Using 302.AI's Azure TTS relay
Registration and login address (gives $1 credit): https://share.302.ai/pyvideo
OpenAI TTS has only 9 voices, and its Chinese pronunciation sounds slightly slurred. If that isn't good enough for you, try Azure TTS. It is a Microsoft product with more voices and better results than edge-tts. However, using it directly in China requires a foreign credit card. If that is inconvenient, you can use the relay API provided by 302.AI.
Operation method:
- Create a Key on 302.AI.
- Open the software's Menu--Translation Settings--302.AI and fill in the Key. Note that this time the Key goes under the "302.AI" option in the "Translation Settings" menu, not under TTS Settings.
After filling it in, you can use all the voiceover roles of Azure TTS. Moreover, 302.AI also relays ByteDance speech synthesis, so ByteDance's voices can also be used directly.
Using ByteDance Speech Synthesis Individually
A detailed tutorial for ByteDance speech synthesis is already available: https://pvt9.com/volcenginetts.
However, please note that by default only the general male and general female voices are available. Other voices must be purchased separately from the ByteDance official website and are billed monthly, which is not cost-effective if you only use them occasionally. In that case, it is more convenient to use 302.AI as described above, which gives direct access to a range of ByteDance voices.