The IMS Toucan TTS project claims to support speech synthesis in more than 7,000 languages. I tried it and it does work, but the quality is only average. It is usable if your requirements are modest.

Unlike edge-tts, this project does not offer a set of named voices to choose from. Each language has one fixed voice, which you can vary slightly with the parameters prosody_creativity, duration_scaling_factor, voice_seed, and emb1. These adjust the randomly generated voice, its seed, and its gender.
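
To illustrate why a seed parameter such as voice_seed gives a repeatable "random" voice, here is a minimal Python sketch. It is not IMS Toucan's code: the function `sample_voice_vector` and the vector size are hypothetical, and it only shows the general behavior of a seeded random generator, where the same seed always produces the same values.

```python
import random


def sample_voice_vector(seed: int, size: int = 4) -> list:
    """Draw a small pseudo-random vector from a seeded generator.

    Illustration only: this is a hypothetical stand-in, not how
    IMS Toucan builds its voice embeddings.
    """
    rng = random.Random(seed)  # private generator, global state untouched
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


first = sample_voice_vector(seed=42)
second = sample_voice_vector(seed=42)
different = sample_voice_vector(seed=43)

print(first == second)     # same seed, same vector
print(first == different)  # a different seed gives a different vector
```

In practice this means that once you find a seed whose voice you like, writing the number down should let you get the same voice again.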

Project address: https://github.com/DigitalPhonetics/IMS-Toucan

Local Deployment Method

You can deploy from source by following the instructions in the project's GitHub repository: https://github.com/DigitalPhonetics/IMS-Toucan

If you would rather skip that setup, I have also built an integrated package for Windows.

Download the integrated package from Baidu Netdisk and extract it to a directory, for example, D:/python/IMS-Toucan.

Integrated package download address: https://pan.baidu.com/s/1om62tz-fmq4o5sijmHmnMQ?pwd=dck6

After extracting, you'll find an espeak-ng-X64.msi file. Installing it is optional, but it improves the sound quality. To install it, double-click the file and follow the default installation steps.

image.png

You'll see three bat files in the directory, which you can double-click to execute.

image.png

启动api加简单网页.bat (Start API with Simple Webpage.bat):

Double-clicking this starts an API service and opens a simple web page. The API can be connected to the custom TTS interface of the video translation software. It supports only 24 commonly used languages.

image.png

The API address is http://127.0.0.1:5020/api. Enter this address in the custom TTS interface of the video translation software.
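
Before entering the address in other software, it can help to confirm that the service is actually listening. The following is a minimal Python sketch that only checks whether a TCP connection to 127.0.0.1:5020 succeeds. It does not call the API itself, because the request format is not described here. The function name `is_service_listening` is my own.

```python
import socket


def is_service_listening(host: str = "127.0.0.1", port: int = 5020,
                         timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the port is closed or unreachable
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_service_listening():
        print("Port 5020 is open; the API service appears to be running.")
    else:
        print("Nothing is listening on port 5020; start the bat file first.")
```

An open port only shows that some program is listening there, so still run the test inside the video translation software afterwards.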

启动完整网页ui.bat (Start Complete Web UI.bat):

Double-clicking this will start the official IMS Toucan web interface, which supports synthesis and voiceover in all languages. You can explore and experiment with it yourself.

image.png

If the browser does not open the page automatically, wait until the terminal displays the following, then copy the address and open it in the browser manually: image.png

启动高级QT-ui.bat (Start Advanced QT-UI.bat):

Double-clicking this will start the built-in software interface. This interface is not localized, but you can study it if you're interested.

image.png

Important Notes

  1. When starting, the terminal window may display a lot of information, as shown in the figure below. This is not an error and can be ignored.

    image.png

  2. The API and the complete web UI automatically open their pages in the browser after starting. The advanced QT UI opens its own software window.

  3. Sometimes many errors are reported, and the messages include the address https://docs.microsoft.com (a Microsoft website). In this case, close the window, right-click the bat file, and run it again as an administrator.

  4. The integrated package includes the model, but it may check for model updates on startup, which requires a connection to https://huggingface.co. If you cannot reach that site from your network, for example from mainland China, you need to prepare a proxy. If HTTPSConnect appears in the error message, enable a global or system proxy.
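
As an alternative to a system-wide proxy, many Python HTTP libraries (for example requests) read the HTTP_PROXY and HTTPS_PROXY environment variables. The sketch below starts a bat file with those variables set. Whether the integrated package honors them is an assumption I have not verified, and the proxy address, the directory, and the function names are placeholders of my own.

```python
import os
import subprocess


def build_proxy_env(proxy_url: str, base_env=None) -> dict:
    """Return a copy of the environment with the standard proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in ("HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = proxy_url
    return env


def launch_with_proxy(bat_name: str, workdir: str, proxy_url: str) -> int:
    """Run a bat file through cmd.exe with the proxy variables set (Windows only)."""
    completed = subprocess.run(
        ["cmd", "/c", bat_name],
        cwd=workdir,
        env=build_proxy_env(proxy_url),
    )
    return completed.returncode
```

A call might look like `launch_with_proxy("启动api加简单网页.bat", "D:/python/IMS-Toucan", "http://127.0.0.1:7890")`, where the proxy address is a hypothetical example that you would replace with your own.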

Using in Video Translation Software

First, upgrade the video translation software by installing the latest patch package. Download address: https://pyvideotrans.com

After starting the software, click Menu - TTS Settings - Custom TTS Interface and enter http://127.0.0.1:5020/api in the API address field. The role list can contain any letters, for example a,b,c.

image.png

image.png

Run the test, and once it completes without problems, you can start using it.

image.png