This project provides a web UI and an API for Kokoro TTS. It supports voiceovers in 8 languages: Chinese, English, Japanese, French, Italian, Portuguese, Spanish, and Hindi.
Project address: https://github.com/jianchang512/kokoro-uiapi
Web Interface
The default UI address after startup: http://127.0.0.1:5066
- Supports voiceovers for text and SRT subtitles
- Supports online listening and downloading
- Supports subtitle alignment
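To make the SRT feature concrete, here is a minimal sketch of how SRT text breaks down into timed cues, which is the unit a subtitle voiceover has to work through. It is not this project's implementation; `Cue` and `parse_srt` are hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def _to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timing fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(srt_text):
    """Split SRT text into cues (hypothetical helper, not the project's parser)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_to_ms(*fields[:4]), _to_ms(*fields[4:]), text))
                break
    return cues
```

The difference `end_ms - start_ms` is the time slot each cue has available, which is the quantity subtitle alignment needs to respect.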
Installation Method
Windows
On Windows 10/11, download the integrated package and double-click start.bat to start. GPU acceleration requires an NVIDIA graphics card with CUDA 12 installed.
Baidu Netdisk Download Address: https://pan.baidu.com/s/1jTB84E3-gaLqFrl32f4sDw?pwd=xnwp
GitHub download (models not included; they must be downloaded online, which may require a VPN): https://github.com/jianchang512/kokoro-uiapi/releases/download/v0.1/kokoro-uiapi-noModels-v0.2.7z
Linux/macOS
First, make sure Python 3.8+ is installed on the system. Versions 3.10-3.11 are recommended.
Install ffmpeg in advance. On Linux, use
apt install ffmpeg
or
yum install ffmpeg
On macOS, use
brew install ffmpeg
- Pull the source code
git clone https://github.com/jianchang512/kokoro-uiapi
- Create and activate a virtual environment
cd kokoro-uiapi
python3 -m venv venv
. venv/bin/activate
- Install dependencies
pip3 install -r requirements.txt
- Start
python3 app.py
Usage in pyVideoTrans
First, start this project: for the Windows integrated package, double-click start.bat; for a source code installation, run python3 app.py.
Then upgrade pyVideoTrans to v3.48+, open Menu -- TTS Settings -- Kokoro TTS, and enter http://127.0.0.1:5066 as the HTTP address.
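Before entering the address in pyVideoTrans, it can help to confirm that the service is actually listening. The following is a minimal sketch using only the standard library; `is_listening` is a hypothetical helper, and the host and port are the defaults quoted in this document.

```python
import socket


def is_listening(host="127.0.0.1", port=5066, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling `is_listening()` after start.bat or python3 app.py has finished starting should return True. A False result suggests the service is not running or is bound to a different port.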
OpenAI API Compatibility
The API is compatible with OpenAI TTS.
The default API address after startup: http://127.0.0.1:5066/v1/audio/speech
Request method: POST
Request body: application/json
{
"input": "text to be voiced",
"voice": "voice role",
"speed": 1.0
}
speed is the speech rate and defaults to 1.0.
On success, the response body is MP3 audio data.
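The request described above can also be sent without the OpenAI SDK. The following is a minimal sketch using only the Python standard library; `build_request` and `synthesize` are hypothetical helper names, and the sketch assumes the server accepts a body with just the three fields listed above.

```python
import json
import urllib.request

# Default API address from this document.
API_URL = "http://127.0.0.1:5066/v1/audio/speech"


def build_request(text, voice, speed=1.0, url=API_URL):
    """Build the POST request with a JSON body of input, voice, and speed."""
    payload = {"input": text, "voice": voice, "speed": speed}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice, out_path, speed=1.0, url=API_URL):
    """Send the request and write the returned MP3 bytes to out_path."""
    with urllib.request.urlopen(build_request(text, voice, speed, url)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)  # number of bytes written
```

With the service running, `synthesize("Hello, dear friends", "af_bella", "test.mp3")` should write the audio to test.mp3. A failed request raises `urllib.error.URLError` or `urllib.error.HTTPError`.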
OpenAI SDK Usage Example
from openai import OpenAI

client = OpenAI(
    api_key='123456',
    base_url='http://127.0.0.1:5066/v1'
)

try:
    response = client.audio.speech.create(
        model='tts-1',
        input='Hello, dear friends',
        voice='zf_xiaobei',
        response_format='mp3',
        speed=1.0
    )
    with open('./test_openai.mp3', 'wb') as f:
        f.write(response.content)
    print("MP3 file saved successfully to test_openai.mp3")
except Exception as e:
    print(f"An error occurred: {e}")
Role List
English voice roles:
af_alloy
af_aoede
af_bella
af_jessica
af_kore
af_nicole
af_nova
af_river
af_sarah
af_sky
am_adam
am_echo
am_eric
am_fenrir
am_liam
am_michael
am_onyx
am_puck
am_santa
bf_alice
bf_emma
bf_isabella
bf_lily
bm_daniel
bm_fable
bm_george
bm_lewis
Chinese roles:
zf_xiaobei
zf_xiaoni
zf_xiaoxiao
zf_xiaoyi
zm_yunjian
zm_yunxi
zm_yunxia
zm_yunyang
Japanese roles:
jf_alpha
jf_gongitsune
jf_nezumi
jf_tebukuro
jm_kumo
French roles: ff_siwis
Italian roles: if_sara, im_nicola
Hindi roles: hf_alpha, hf_beta, hm_omega, hm_psi
Spanish roles: ef_dora, em_alex, em_santa
Portuguese roles: pf_dora, pm_alex, pm_santa
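The role names above follow a visible pattern: the first letter of the prefix tracks the language group, and the second letter is f or m. The sketch below decodes a role name on that basis. The language mapping is read off the groupings in this list; treating f/m as female/male is an assumption, and `describe_voice` is a hypothetical helper.

```python
# First prefix letter -> language, taken from the groupings in the role list.
LANGUAGES = {
    "a": "English", "b": "English", "z": "Chinese", "j": "Japanese",
    "f": "French", "i": "Italian", "h": "Hindi", "e": "Spanish",
    "p": "Portuguese",
}

# Second prefix letter -> gender (assumption: f = female, m = male).
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice):
    """Return (language, gender, name) for a role such as 'zf_xiaobei'."""
    prefix, sep, name = voice.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected role name: {voice!r}")
    try:
        return LANGUAGES[prefix[0]], GENDERS[prefix[1]], name
    except KeyError:
        raise ValueError(f"unknown prefix in role name: {voice!r}") from None
```

A helper like this can be used to pick a role that matches the language of the input text before sending a request.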
Proxy VPN
Source code deployment requires downloading the voice .pt files from huggingface.co. Set up a global or system proxy in advance so that huggingface.co is reachable.
Alternatively, download the models in advance and extract them to the directory where app.py is located.
Model download address: https://github.com/jianchang512/kokoro-uiapi/releases/download/v0.1/moxing--jieya--dao--app.py--mulu.7z
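If a system-wide proxy is not convenient, one possible alternative is to pass the proxy to the process through environment variables. The sketch below rests on assumptions: the proxy address http://127.0.0.1:7890 is only a placeholder, `proxied_env` and `start_with_proxy` are hypothetical helpers, and whether this project's model download honors these variables has not been verified here (many Python HTTP clients, including requests and urllib, do read them).

```python
import os
import subprocess
import sys

PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def proxied_env(proxy_url, base_env=None):
    """Return a copy of the environment with the proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in PROXY_VARS:
        env[name] = proxy_url
    return env


def start_with_proxy(proxy_url, script="app.py"):
    """Run the script with the current interpreter and the proxied environment."""
    return subprocess.call([sys.executable, script], env=proxied_env(proxy_url))
```

Running `start_with_proxy("http://127.0.0.1:7890")` from the kokoro-uiapi directory would start app.py with the proxy variables set for that process only.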