Adding New Models from Hugging Face
This document applies to the STT speech-to-text project: https://github.com/jianchang512/stt
From version 0.0.94 onwards, you can add models compatible with faster-whisper/ctranslate2 from huggingface.co. This allows you to use models specifically trained for a particular language, addressing the limitations of general-purpose models.
How to Add a Model
Update to version 0.0.94 or later.
Make sure you can access the internet using a proxy. You should understand what a proxy and a proxy port are. If you don't, it's best not to proceed, as accessing huggingface.co and downloading models requires a proxy.
Search for the model you want to use on https://huggingface.co/models. Important: It must be a model compatible with faster-whisper/ctranslate2. Otherwise, it will not work.
For example, I found this model: https://huggingface.co/zh-plus/faster-whisper-large-v2-japanese-5k-steps
Its model card says: "Converted from clu-ling/whisper-large-v2-japanese-5k-steps using CTranslate2."
Because the card states that the model was converted with CTranslate2, it can be used.
As shown in the image above, click to copy the ID. Then open the set.ini file in the software directory, find the model_list= line, add a comma at the end, paste the ID you copied, and save the changes.
Open the software, enter your network proxy address, select the name you just pasted from the model list, and click Start.
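The set.ini edit described above can also be scripted. The following is a minimal sketch, not part of the STT project: the function name add_model_to_list and the sample file contents are hypothetical, and it assumes only what is stated above, namely that set.ini contains a line starting with model_list= whose value is a comma-separated list.

```python
def add_model_to_list(ini_text: str, model_id: str) -> str:
    """Return ini_text with model_id appended to the model_list= line."""
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("model_list="):
            key, _, value = line.partition("=")
            # Split the existing comma-separated value and drop empty entries.
            models = [m.strip() for m in value.split(",") if m.strip()]
            if model_id not in models:  # avoid adding the same ID twice
                models.append(model_id)
            lines[i] = key + "=" + ",".join(models)
            break
    else:
        raise ValueError("no model_list= line found")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file contents; a real set.ini will contain other settings.
    sample = "model_list=tiny,base,small\n"
    model_id = "zh-plus/faster-whisper-large-v2-japanese-5k-steps"
    print(add_model_to_list(sample, model_id), end="")
```

To apply it to a real file, read set.ini into a string, pass it through the function, and write the result back. Keep a backup copy of the file first.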
If you are using v2ray-like software, the default proxy address is http://127.0.0.1:10809. If you are using clash-like software, the default proxy address is http://127.0.0.1:7890.
Note: The selected video language must match the language supported by the model you added. If you select a Japanese model but choose a Chinese video, you will not get the desired result.
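If you are unsure whether your proxy is actually running on one of the default addresses mentioned above, a quick check is to see whether anything accepts connections on that port. The sketch below uses only the Python standard library. The names DEFAULT_PROXIES, proxy_host_port, and proxy_is_listening are hypothetical, and an open port only shows that something is listening there, not that the proxy can reach huggingface.co.

```python
import socket
from urllib.parse import urlparse

# Default addresses quoted in this document; your proxy software may differ.
DEFAULT_PROXIES = {
    "v2ray": "http://127.0.0.1:10809",
    "clash": "http://127.0.0.1:7890",
}


def proxy_host_port(proxy_url: str) -> tuple[str, int]:
    """Split a proxy URL such as http://127.0.0.1:7890 into (host, port)."""
    parsed = urlparse(proxy_url)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"proxy URL needs a host and a port: {proxy_url!r}")
    return parsed.hostname, parsed.port


def proxy_is_listening(proxy_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the proxy's host and port succeeds."""
    host, port = proxy_host_port(proxy_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, url in DEFAULT_PROXIES.items():
        print(name, url, "listening" if proxy_is_listening(url) else "not reachable")
```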
After you click Start, if the model is not found locally when the subtitle recognition stage begins, the software automatically connects to huggingface.co to download it. Depending on your proxy speed, this may take anywhere from a few minutes to tens of minutes. Please be patient.
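For readers who want to see roughly what such an automatic download looks like outside the software, here is a hedged sketch using the faster-whisper library directly. It is not the STT project's own code. The helper names proxy_env and load_model are hypothetical, and the sketch assumes that the download client honors the standard HTTP_PROXY and HTTPS_PROXY environment variables and that your installed faster-whisper version accepts a Hugging Face model ID.

```python
import os


def proxy_env(proxy_url: str) -> dict:
    """Environment variables that common HTTP clients read for proxy settings."""
    return {"HTTP_PROXY": proxy_url, "HTTPS_PROXY": proxy_url}


def load_model(model_id: str, proxy_url: str, download_root=None):
    """Load model_id with faster-whisper, downloading it through proxy_url if needed."""
    os.environ.update(proxy_env(proxy_url))
    # Imported lazily so proxy_env works even without faster-whisper installed.
    from faster_whisper import WhisperModel

    # If model_id is not cached locally, it is fetched from huggingface.co.
    return WhisperModel(model_id, download_root=download_root)
```

For example, load_model("zh-plus/faster-whisper-large-v2-japanese-5k-steps", "http://127.0.0.1:7890") would fetch the model used as the example in this document on first use, assuming a clash-like proxy on its default port.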
As long as no red error appears, the download is in progress. If a red error appears, it is usually a proxy problem, such as a slow or unstable proxy connection. The error message generally contains Connection to huggingface.co timed out or a string of numbers like 46573454354, indicating incomplete data.
Note: If you are deploying from source code, a proxy network error will only produce errors like No such file xxxx.