The IMS Toucan TTS project claims to support text-to-speech in over 7000 languages. I tried it and it does work, but the quality is only passable: not excellent, yet usable if your expectations are modest.
Unlike edge-tts, this project doesn't offer a set of fixed voices to choose from. Instead, each language has one fixed voice. You can vary that voice (random voice, seed, gender) with parameters such as prosody_creativity, duration_scaling_factor, voice_seed, and emb1.
Project address: https://github.com/DigitalPhonetics/IMS-Toucan
Local Deployment Method
You can deploy it from source by following the instructions in the project's GitHub repository: https://github.com/DigitalPhonetics/IMS-Toucan
I've also created a Windows integrated package for those who don't want to go through the hassle of deploying it manually.
Download the integrated package from Baidu Netdisk and extract it to a directory, such as D:/python/IMS-Toucan.
Integrated package download address: https://pan.baidu.com/s/1om62tz-fmq4o5sijmHmnMQ?pwd=dck6
After extracting, you'll find a file named espeak-ng-X64.msi. Installing it is optional, but it improves the sound quality. To install, double-click the file and follow the default steps.
You will also see three .bat files in the directory, each of which can be run by double-clicking.
Start api and simple webpage.bat:
Double-clicking this starts an API service and opens a simple web page. The API can be connected to the custom TTS interface of the video translation software. It supports only 24 commonly used languages.
The API address is http://127.0.0.1:5020/api, which you enter in the custom TTS interface of the video translation software.
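The text gives the API address but not the request format. As a minimal sketch only, the following Python shows how a client might prepare a POST request to that address. The form field names text and language, and the helper names build_tts_request and send_tts_request, are assumptions made for illustration and are not taken from the integrated package, so check the actual service before relying on them.

```python
import urllib.parse
import urllib.request

# Address given in the text for the local API service.
API_URL = "http://127.0.0.1:5020/api"


def build_tts_request(text, language, api_url=API_URL):
    """Prepare (but do not send) a POST request for the local TTS API.

    ASSUMPTION: the field names "text" and "language" are hypothetical;
    the real service may expect different names or a JSON body.
    """
    payload = urllib.parse.urlencode(
        {"text": text, "language": language}
    ).encode("utf-8")
    return urllib.request.Request(api_url, data=payload, method="POST")


def send_tts_request(text, language, timeout=60):
    """Send the request and return (HTTP status, raw response body)."""
    request = build_tts_request(text, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()


# Example (requires the .bat service to be running):
# status, body = send_tts_request("Hello", "en")
```

Keeping request construction separate from sending makes it easy to adjust the field names once the real format is known.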
Start complete webpage ui.bat:
Double-clicking this will start the official IMS Toucan web interface, which supports the synthesis and dubbing of all languages. You can try to explore it yourself.
If the browser does not open the page automatically, wait until the terminal displays the following, then copy the address and open it in the browser manually:
Start advanced QT-ui.bat:
Double-clicking this will start the built-in software interface. This interface has not been localized. If you are interested, you can study it.
Important Notes
- On startup, the terminal window may display a lot of information, as shown in the figure below. This is not an error and can be ignored.
- Once started, the API and the complete web UI automatically open the corresponding page in the browser, and the advanced QT UI automatically opens the software window.
- Sometimes a series of errors appears that includes a Microsoft website, https://docs.microsoft.com. In that case, close the window and re-run the .bat file as administrator.
- The integrated package comes with a model, but on startup it may check whether a model update is available, which requires a connection to https://huggingface.co. If that site cannot be reached directly from your network, you need a proxy. If the word HTTPSConnect appears in the error, you need a global or system proxy.
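The two error cases above can be told apart by looking for the strings the text mentions. A minimal sketch, assuming you have copied the terminal output into a string; the function name diagnose_startup_log is hypothetical and the checks are simple substring matches, not a real log parser:

```python
def diagnose_startup_log(log_text):
    """Return the hints from the notes above that match the terminal output."""
    hints = []
    # Errors that mention the Microsoft docs site: re-run as administrator.
    if "docs.microsoft.com" in log_text:
        hints.append("Close the window and re-run the .bat file as administrator.")
    # Errors containing HTTPSConnect: huggingface.co could not be reached.
    if "HTTPSConnect" in log_text:
        hints.append("Enable a global or system proxy so huggingface.co is reachable.")
    return hints
```

An empty list means neither known pattern was found, which matches the case where the terminal output is only informational.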
Using in Video Translation Software
First, upgrade the video translation software to the latest patch package. Download address: https://pyvideotrans.com
After starting the software, click Menu - TTS Settings - Custom TTS Interface, and enter http://127.0.0.1:5020/api as the API address. You can fill in any letters in the role list, such as a,b,c.
After testing and confirming that there are no problems, you can use it.
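If the test fails, it can help to confirm first that the API service is actually listening on the address above. A minimal sketch using only the Python standard library; the function name is_api_listening is hypothetical, and the check only verifies that the port accepts TCP connections, not that synthesis works:

```python
import socket


def is_api_listening(host="127.0.0.1", port=5020, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If this returns False, start "Start api and simple webpage.bat" again and wait for it to finish loading before testing in the video translation software.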