IMS Toucan TTS claims to support voice synthesis for over 7,000 languages. I tried it out and it does work, but the quality is only average, not excellent. It's usable if your standards aren't too high.

Unlike edge-tts, which offers several fixed voices to choose from, this project assigns one default voice to each language. You can still vary the voice (random seed, gender, and so on) through parameters such as prosody_creativity, duration_scaling_factor, voice_seed, and emb1.
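
To make the role of a seed such as voice_seed more concrete, here is a minimal illustration of the general idea: the same seed always yields the same pseudo-random "voice vector", while a different seed yields a different one. This is not IMS Toucan code. The function name sample_voice_vector and the vector size are hypothetical, chosen only for this sketch.

```python
import random


def sample_voice_vector(voice_seed, size=4):
    """Illustration only: derive a reproducible pseudo-random vector from a seed.

    This does not call IMS Toucan; it just shows why a fixed seed gives a
    repeatable result and a new seed gives a different one.
    """
    rng = random.Random(voice_seed)  # independent generator, seeded explicitly
    return [round(rng.uniform(-1.0, 1.0), 3) for _ in range(size)]


same_a = sample_voice_vector(7)
same_b = sample_voice_vector(7)   # identical to same_a
other = sample_voice_vector(8)    # different seed, different values
```

In practice this means that once you find a seed whose voice you like, reusing that seed should reproduce it, assuming the other settings stay unchanged.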

Project URL: https://github.com/DigitalPhonetics/IMS-Toucan

Local Deployment Guide

You can deploy from source by following the instructions on the official project page: https://github.com/DigitalPhonetics/IMS-Toucan

For convenience, I've prepared a Windows integrated package for those who prefer not to set it up manually.

Download the integrated package from Baidu Netdisk and extract it to a directory, such as D:/python/IMS-Toucan.

Integrated package download link: https://pan.baidu.com/s/1om62tz-fmq4o5sijmHmnMQ?pwd=dck6

After extraction, you'll find a file named espeak-ng-X64.msi. Installing it is optional but improves audio quality. Double-click and follow the default steps to install.

In the directory, you'll see three .bat files that can be executed by double-clicking.

Start API with Simple Webpage.bat:

Double-clicking this will start an API service and open a simple webpage. This API supports 24 commonly used languages and can be used with video translation software as a custom TTS interface.

The API address is http://127.0.0.1:5020/api, which can be entered in the custom TTS interface of video translation software.
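
If you want to call the service from your own script rather than from video translation software, something like the following sketch may work. Only the address http://127.0.0.1:5020/api comes from this article. The form field names (text, language, voice) and the reply shape {"code": 0, "msg": ..., "data": ...} are assumptions and have not been checked against the package, so adjust them to whatever the service actually expects.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://127.0.0.1:5020/api"  # address given in this article


def build_request(text, language, voice="a", api_url=API_URL):
    """Build a form-encoded POST request.

    The field names 'text', 'language' and 'voice' are assumptions.
    """
    form = urllib.parse.urlencode(
        {"text": text, "language": language, "voice": voice}
    ).encode("utf-8")
    return urllib.request.Request(api_url, data=form, method="POST")


def parse_response(raw):
    """Return the 'data' field of an assumed JSON reply, or raise on error."""
    reply = json.loads(raw.decode("utf-8"))
    if reply.get("code") != 0:  # assumed convention: 0 means success
        raise RuntimeError(f"TTS API reported an error: {reply.get('msg')}")
    return reply["data"]


def synthesize(text, language, voice="a", timeout=60):
    """Send one request to the local service and return the parsed result."""
    request = build_request(text, language, voice)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_response(resp.read())
```

With the API service running, a call such as synthesize("Hello world", "en") would send one request. What the returned value contains (for example a link to the audio file) depends on the actual service.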

Start Full Web UI.bat:

Double-clicking this launches the official IMS Toucan web interface, which supports synthesis and voice-over for all languages. Feel free to explore it.

If the browser doesn't open automatically, copy the address displayed in the terminal and open it manually in your browser.

Start Advanced QT UI.bat:

Double-clicking this opens the built-in software interface, which is not localized into Chinese. If you're interested, you can explore it further.

Important Notes

  1. When starting, the terminal may print a large amount of output. This is not an error and can be ignored.

  2. The API and Full Web UI will automatically open the corresponding pages in your browser, while the Advanced QT UI will open the software window directly.

  3. If you encounter errors mentioning https://docs.microsoft.com, close the window and rerun the .bat file as an administrator.

  4. The integrated package includes pre-loaded models, but it may check for updates on startup, which requires access to https://huggingface.co. If you can't reach this site, you'll need a VPN. If you see HTTPSConnect errors, enable the global or system proxy.
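
As an alternative to switching on a system-wide proxy, you could try starting a .bat file with proxy settings applied only to that process. The sketch below is a rough idea, not a tested recipe. The proxy address http://127.0.0.1:7890 is a hypothetical example, and whether the package honours HTTP_PROXY / HTTPS_PROXY is an assumption. HF_HUB_OFFLINE=1 is an environment variable of the huggingface_hub library that tells it to skip network calls; whether this package starts correctly with it set is untested here.

```python
import os
import subprocess


def build_env(proxy_url=None, offline=False, base_env=None):
    """Return a copy of the environment with optional proxy/offline settings."""
    env = dict(os.environ if base_env is None else base_env)
    if proxy_url:
        # Common convention read by many Python HTTP clients.
        env["HTTP_PROXY"] = proxy_url
        env["HTTPS_PROXY"] = proxy_url
    if offline:
        # huggingface_hub setting: do not contact huggingface.co.
        env["HF_HUB_OFFLINE"] = "1"
    return env


def launch(bat_name, folder, env):
    """Start a .bat file from the package folder (Windows only)."""
    return subprocess.Popen(["cmd", "/c", bat_name], cwd=folder, env=env)


# Example (hypothetical proxy address; folder as used in this article):
# launch("Start API with Simple Webpage.bat", "D:/python/IMS-Toucan",
#        build_env(proxy_url="http://127.0.0.1:7890"))
```

The launch call is left commented out so that nothing starts by accident. Uncomment it and substitute your own proxy address and folder.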

Using with Video Translation Software

First, update your video translation software to the latest patch. Download it from: https://pyvideotrans.com

After starting the software, go to Menu > TTS Settings > Custom TTS Interface. Enter http://127.0.0.1:5020/api in the API address field. For the role list, you can enter any letters, such as a, b, c.

Once testing is successful, you can start using it.
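
If the test fails, it can help to first check whether anything is listening on the API port at all. The following small sketch only opens a TCP connection to 127.0.0.1:5020 (the address used in this article). It does not verify that the service answers TTS requests correctly. The function name is_listening is just a name chosen for this example.

```python
import socket


def is_listening(host="127.0.0.1", port=5020, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling is_listening() returns False when the API .bat file has not been started, or has not finished loading yet. In that case, start it, wait for the terminal output to settle, and test again.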
