I used to rely on edge-tts for voiceovers, and it ran smoothly with almost no issues. Unfortunately, since the end of last year it has frequently returned 403 errors. At first this happened only in China, and switching to a foreign IP mostly worked around it, but now the error occurs globally. It seems that even a company as large as Microsoft can't absorb everyone's heavy "free riding."
If you still want to use edge-tts, use it sparingly and, above all, avoid sending frequent requests from the same IP. Otherwise Microsoft's server will return a 403 error, which the software reports as a "rate limiting error" to make it easier to understand. Here are two workarounds:
- Deploy the interface to Cloudflare. Its dynamic nature can reduce how often 403 errors occur. For the specific steps, refer to the documentation: https://pvt9.com/edgettscf
- Or keep using it locally through a dynamic proxy, so that each request goes out from a different IP. For the specific steps, see this article: https://pvt9.com/edgetts-proxy
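The "use it sparingly, from a different IP each time" advice can be sketched roughly as follows. This is only an illustration under stated assumptions: `PROXIES`, `synthesize_with_retry`, and the `synthesize` callback are hypothetical names, not part of the software or of edge-tts, and the callback is assumed to raise an exception when the server answers with 403.

```python
import itertools
import random
import time

# Hypothetical proxy pool; replace with addresses from your own proxy provider.
PROXIES = ["http://127.0.0.1:10801", "http://127.0.0.1:10802"]


def synthesize_with_retry(synthesize, text, proxies=PROXIES,
                          max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call synthesize(text, proxy), switching proxy and backing off on failure.

    `synthesize` is any callable you supply that performs one TTS request
    through the given proxy and raises an exception on a 403 response.
    """
    proxy_cycle = itertools.cycle(proxies)
    last_error = None
    for attempt in range(max_attempts):
        proxy = next(proxy_cycle)  # a different exit IP for each attempt
        try:
            return synthesize(text, proxy)
        except Exception as err:  # in real code, catch only the 403 error type
            last_error = err
            if attempt < max_attempts - 1:
                # Exponential backoff with jitter: roughly 2s, 4s, 8s, ...
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

The callback keeps the sketch independent of any particular TTS library. Recent versions of the edge-tts Python package appear to accept a `proxy` argument on `Communicate`, so the callback could wrap that call, but check the version you have installed before relying on it.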
Using Local Voiceover Models
Besides edge-tts, you can use open-source local voiceover models such as GPT-SoVITS, ChatTTS-ui, Fish-TTS, F5-TTS, CosyVoice, Clone-voice, KokoroTTS, etc. They are all free and run on your own computer once deployed. However, setting them up takes extra time and requires reasonably capable hardware and some hands-on skill.
If you want to try one, see this tutorial: https://pvt9.com/gptsovits. More guides are listed in the left sidebar of that page.
Using Online Voiceover API Instead
If your hardware is not powerful enough, or you don't want to bother with local deployment, you can use an online voiceover API such as OpenAI TTS, Azure TTS, or ByteDance Volcano Speech Synthesis.
However, using OpenAI TTS or Azure TTS directly from China requires a VPN, and the free quota is very limited. Paying also requires a foreign phone number and credit card, which is quite troublesome. It is more convenient to use an OpenAI TTS or Azure TTS relay service that can be accessed directly from China.
If you use the official OpenAI TTS, open Menu--TTS Settings--OpenAI TTS API in the software and enter your SK in the SK text box. Nothing else needs to be set. Remember that you still need a VPN to use it from China.
The following steps explain how to use third-party relayed OpenAI TTS voiceovers, Azure TTS voiceovers, and ByteDance speech synthesis.
Using 302.AI or Other Third-Party OpenAI TTS Voiceover Relay API
Registration and login address (free $1 credit): https://share.302.ai/pyvideo
The operation steps are very simple:
- In the software's Menu--TTS Settings--OpenAI TTS API, fill in the API URL as https://api.302.ai/v1 (if you are using a relay API from another company, fill in the address they provide, usually ending with /v1).
- In the SK text box, fill in the API Key you created on 302.AI. If it is another third-party service, fill in the Key they provide.
Test it. If the voiceover audio plays automatically, the settings are correct. After that, you can select OpenAI TTS as the voiceover channel on the software's main interface. Supported voices are: alloy, ash, coral, echo, fable, onyx, nova, sage, shimmer.
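For readers who want to check a Key outside the software, here is a minimal sketch of the request such a relay is expected to accept. It assumes the relay mirrors the official OpenAI `/audio/speech` endpoint under the API URL you entered and that the `tts-1` model is available there; the function names are hypothetical, and this is not how the software itself is implemented.

```python
import json
import urllib.request

# Voice names listed above.
VOICES = {"alloy", "ash", "coral", "echo", "fable", "onyx", "nova", "sage", "shimmer"}


def build_speech_request(api_url, api_key, text, voice="alloy", model="tts-1"):
    """Build a POST request for an OpenAI-compatible /audio/speech endpoint."""
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        api_url.rstrip("/") + "/audio/speech",  # assumed path, as in the official API
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_url, api_key, text, out_path, voice="alloy"):
    """Send the request and save the returned audio bytes (MP3 by default)."""
    request = build_speech_request(api_url, api_key, text, voice)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Calling `synthesize_to_file("https://api.302.ai/v1", "<your key>", "Hello", "test.mp3", voice="nova")` should produce a playable file if the Key and URL are valid. An HTTP error at this point usually means the URL or Key is wrong, not that the software is misconfigured.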
Using 302.AI Relayed Azure TTS
Registration and login address (free $1 credit): https://share.302.ai/pyvideo
OpenAI TTS has only 9 voices, and its Chinese pronunciation sounds a bit "lispy." If that isn't good enough for you, try Azure TTS. It is a Microsoft product with more voices and better quality than edge-tts. However, using it directly from China requires a foreign credit card. If that is inconvenient, you can use the relay API provided by 302.AI.
Operation method:
- Create a Key on 302.AI.
- Open the software's Menu--Translation Settings--302.AI and fill in the Key. Note that this time the Key goes in the "302.AI" option under the "Translation Settings" menu, not under "TTS Settings."
Once the Key is filled in, all Azure TTS voices are available. 302.AI also relays ByteDance speech synthesis, so ByteDance's voices can be used directly as well.
Using ByteDance Speech Synthesis Separately
A detailed tutorial for ByteDance speech synthesis is already available: https://pvt9.com/volcenginetts.
Note, however, that only the general male voice and general female voice are available by default. Other voices must be purchased separately on the ByteDance official website and are billed monthly, which is not cost-effective if you only use them occasionally. In that case it is more convenient to use 302.AI as described above, which gives direct access to various ByteDance voices.