The IMS Toucan TTS project claims to support text-to-speech in over 7000 languages. I tried it, and it does work, but the quality is only average: usable if your expectations are modest.

Unlike edge-tts, this project does not offer a set of named voices to choose from. Each language has a single default voice, which you can vary (random voice, seed, gender) through parameters such as prosody_creativity, duration_scaling_factor, voice_seed, and emb1.

Project address: https://github.com/DigitalPhonetics/IMS-Toucan

Local Deployment Method

You can deploy it from source by following the instructions in the project's repository: https://github.com/DigitalPhonetics/IMS-Toucan

I've also created a Windows integrated package for those who don't want to go through the hassle of deploying it manually.

Download the integrated package from Baidu Netdisk and extract it to a directory, such as D:/python/IMS-Toucan.

Integrated package download address: https://pan.baidu.com/s/1om62tz-fmq4o5sijmHmnMQ?pwd=dck6

After extracting, you'll find a file named espeak-ng-X64.msi. Installing it is optional, but it improves the sound quality. To install it, double-click the file and accept the default steps.

You will see three .bat files in the directory, which can be executed by double-clicking.

Start api and simple webpage.bat:

Double-clicking this will start an API service and open a simple webpage, which can be used to connect to the custom TTS interface of video translation software. This API only supports 24 commonly used languages.

The API address is http://127.0.0.1:5020/api. Enter it in the custom TTS interface of the video translation software.
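
For scripts that want to call this endpoint directly, the sketch below only builds the HTTP request; it does not send it. The article does not document the request format, so the form fields `text` and `language` are assumptions made for illustration. Check the package's simple webpage or its source for the real field names before relying on this.

```python
import urllib.parse
import urllib.request

API_URL = "http://127.0.0.1:5020/api"  # address given in the article


def build_tts_request(text, language, api_url=API_URL):
    """Build (but do not send) a POST request for the local TTS API.

    ASSUMPTION: the field names "text" and "language" are hypothetical;
    the article does not specify the payload format.
    """
    payload = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(
        api_url,
        data=payload.encode("utf-8"),
        method="POST",
    )


# To actually send the request, the service must be running:
# with urllib.request.urlopen(build_tts_request("Hello", "en"), timeout=60) as resp:
#     print(resp.status)
```

Keeping the request construction separate from sending makes it easy to adjust the field names once the real format is known.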

Start complete webpage ui.bat:

Double-clicking this will start the official IMS Toucan web interface, which supports the synthesis and dubbing of all languages. You can try to explore it yourself.

If the browser does not open the page automatically, wait until the terminal displays the local address, then copy that address and open it in the browser manually.
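
If you capture the terminal output yourself, the manual step can be sketched in Python as follows. This is only an illustration: the sample line and port used in the usage note are hypothetical, and the real wording printed by the web UI may differ.

```python
import re
import webbrowser

# Matches a local address such as http://127.0.0.1:<port> or http://localhost:<port>
URL_PATTERN = re.compile(r"https?://(?:127\.0\.0\.1|localhost)(?::\d+)?[^\s]*")


def find_local_url(terminal_text):
    """Return the first local http(s) address found in the text, or None."""
    match = URL_PATTERN.search(terminal_text)
    return match.group(0) if match else None


def open_local_url(terminal_text, opener=webbrowser.open):
    """Open the first local address found in the text; return True if one was found."""
    url = find_local_url(terminal_text)
    if url is None:
        return False
    opener(url)
    return True
```

For example, `open_local_url("Running on local URL:  http://127.0.0.1:7860")` would open that address in the default browser; the line shown here is an assumed example, not output quoted from the package.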

Start advanced QT-ui.bat:

Double-clicking this will start the project's built-in software interface. It has not been localized; explore it if you are interested.

Important Notes

  1. On startup, the terminal window may print a lot of output. This is not an error and can be ignored.

  2. The API and the complete web UI automatically open their page in the browser once started; the advanced QT option opens the software window directly.

  3. Sometimes a batch of errors appears that includes the Microsoft address https://docs.microsoft.com. In that case, close the window and re-run the .bat file as administrator.

  4. The integrated package ships with a model, but on startup it may check https://huggingface.co for a model update. If that site is not reachable from your network (as is the case in mainland China), you need a proxy. If the error contains the word HTTPSConnect, you need a global or system-wide proxy.
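
The proxy note above can be sketched in Python. The helper below checks a captured log for the HTTPSConnect marker and prepares environment variables for a relaunch. Two assumptions apply: the proxy address is whatever your own proxy client listens on (the article gives none), and the package is assumed to honor the standard `HTTP_PROXY`/`HTTPS_PROXY` variables, as most Python HTTP clients do. If it does not, use a global or system proxy as described above.

```python
import os


def looks_like_proxy_error(log_text):
    """True if the log contains the marker the article links to a failed
    connection to huggingface.co."""
    return "HTTPSConnect" in log_text


def env_with_proxy(proxy_url, base_env=None):
    """Return a copy of the environment with the proxy variables set.

    ASSUMPTION: proxy_url is supplied by you; it is not given in the article.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HTTP_PROXY"] = proxy_url
    env["HTTPS_PROXY"] = proxy_url
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when relaunching one of the .bat files from a script.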

Using in Video Translation Software

First, upgrade the video translation software to the latest patch package. Download address: https://pyvideotrans.com

After starting the software, click Menu - TTS Settings - Custom TTS Interface, and enter http://127.0.0.1:5020/api as the API address. The role list can contain any letters, such as a,b,c.
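
Before entering the address, it can help to confirm that the API service is actually running. The following is a minimal sketch that only checks whether something accepts TCP connections on port 5020; it does not verify that the service behind the port is the TTS API.

```python
import socket


def is_listening(host="127.0.0.1", port=5020, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False
```

If `is_listening()` returns False, start "Start api and simple webpage.bat" first and wait for it to finish loading.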

Once a test confirms that everything works, you can start using it.