Using zh_recogn Chinese Speech Recognition

This recognition method supports Chinese speech only. It uses a model from the Alibaba ModelScope community, which handles Chinese well and makes up for the weaker Chinese support of models developed outside China.

How to Use

First, deploy the zh_recogn project.

Then start it, and enter its address (default http://127.0.0.1:9933) in the software under the upper-left menu: Settings - zh_recogn Chinese Speech Recognition - Address.
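Before entering the address, it can help to confirm that something is actually listening on it. The following is a minimal sketch, not part of zh_recogn or pyvideotrans: the function name service_reachable is hypothetical, and it assumes the address is a full URL including the scheme, such as the default shown above. It only checks that a TCP connection succeeds, not that the service behind it is zh_recogn.

```python
import socket
from urllib.parse import urlparse


def service_reachable(address: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical helper for illustration; expects a full URL such as
    "http://127.0.0.1:9933".
    """
    parsed = urlparse(address)
    host = parsed.hostname or "127.0.0.1"
    # Fall back to the standard HTTP port when the URL does not name one.
    port = parsed.port or 80
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling service_reachable("http://127.0.0.1:9933") after starting the project should return True if the service is listening on the default address.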

Then select zh_recogn in the "faster mode" drop-down box of the software interface. With this option selected, you do not need to choose a model or a segmentation method.

Deploying the zh_recogn Project

Source Code Deployment

Install Python 3.10, git, and ffmpeg. On Windows, download ffmpeg.exe and put it in the ffmpeg folder of this project. On macOS, install it with brew install ffmpeg
Create an empty directory whose path contains only English characters. Open cmd in that directory on Windows (a terminal on macOS and Linux) and run git clone https://github.com/jianchang512/zh_recogn ./
Run python -m venv venv, then activate the environment: .\venv\scripts\activate on Windows, or source ./venv/bin/activate on macOS and Linux.
Run pip install -r requirements.txt --no-deps
If you need CUDA acceleration on Windows or Linux, run pip uninstall torch torchaudio, then run pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118
Start the project with python start.py
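If you reinstalled torch for CUDA, you may want to check that the installed build can see a GPU. The sketch below uses only standard PyTorch calls (torch.cuda.is_available and torch.cuda.get_device_name); the function name cuda_status is hypothetical, and whether zh_recogn then uses the GPU depends on the project itself, which this check does not verify. Run it inside the activated virtual environment.

```python
import importlib.util


def cuda_status() -> str:
    """Describe whether PyTorch is installed and whether it can see a CUDA device."""
    # Check for the package first so the function also works where torch is absent.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch

    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "torch is installed, but CUDA is not available (CPU only)"


print(cuda_status())
```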

Pre-packaged Version / Win10 Win11 Only

Download address https://github.com/jianchang512/zh_recogn/releases

After downloading, unzip it to a directory whose path contains only English characters, then double-click start.exe
To reduce the package size, the pre-packaged version does not support CUDA. If you need CUDA acceleration, deploy from source code instead.

Using in the pyvideotrans Project

First upgrade pyvideotrans to v1.62 or later. Then open the settings menu in the upper-left corner, choose zh_recogn Chinese Speech Recognition, and enter the address and port (default http://127.0.0.1:9933). Do not add /api at the end.
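The difference between the base address entered in pyvideotrans and the full API address is easy to get wrong. As an illustration only, the sketch below shows one way to normalize an address and derive the API URL from it. Both function names are hypothetical and are not part of pyvideotrans or zh_recogn.

```python
def normalize_base_address(address: str) -> str:
    """Strip whitespace, trailing slashes, and a trailing /api segment."""
    cleaned = address.strip().rstrip("/")
    if cleaned.lower().endswith("/api"):
        cleaned = cleaned[: -len("/api")].rstrip("/")
    return cleaned


def api_endpoint(address: str) -> str:
    """Build the full API URL from a base address."""
    return normalize_base_address(address) + "/api"
```

For example, normalize_base_address("http://127.0.0.1:9933/api/") gives the value to enter in the settings, and api_endpoint("http://127.0.0.1:9933") gives the URL to use when calling the API directly.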

API

API address: http://ip:port/api (default http://127.0.0.1:9933/api)

Python code example for requesting the API

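import requests

audio_file = "D:/audio/1.wav"

# Upload the audio under the "audio" form field; the with block closes the file afterwards.
with open(audio_file, "rb") as f:
    res = requests.post("http://127.0.0.1:9933/api", files={"audio": f}, timeout=1800)

# Decode the JSON response body (the list of subtitle entries).
print(res.json())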

The response is a list of subtitle entries:

[
	{
		"line": 1,
		"time": "00:00:01,100 --> 00:00:03,300",
		"text": "Subtitle content 1"
	},
	{
		"line": 2,
		"time": "00:00:04,100 --> 00:00:06,300",
		"text": "Subtitle content 2"
	}
]
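Each entry carries a line number, a time range, and the text, which maps directly onto the SRT subtitle layout. The sketch below shows one way to turn such a list into SRT text. The function name to_srt is hypothetical, and it assumes every entry has exactly the line, time, and text keys shown above.

```python
def to_srt(segments: list[dict]) -> str:
    """Join subtitle entries into SRT text: number, time range, text, blank line."""
    blocks = []
    for seg in segments:
        # Assumes the keys "line", "time", and "text" from the response shown above.
        blocks.append(f"{seg['line']}\n{seg['time']}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```

The returned string can be written to a file with a .srt extension using UTF-8 encoding, for example with open("1.srt", "w", encoding="utf-8").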

When entering the address in pyvideotrans, do not add /api at the end.

Web Interface

Precautions

The model is downloaded automatically on first use, which can take a long time.
Only Chinese speech recognition is supported.
You can change the bound address and port in the set.ini file.
