Adding New Models from Hugging Face
This document is for the stt speech-to-text project https://github.com/jianchang512/stt
Starting from version 0.0.94, you can add faster-whisper/ctranslate2-compatible models from huggingface.co, such as models built for one specific language, to compensate for the shortcomings of general-purpose models.
How to Add
Upgrade to version 0.0.94 or later.
Make sure you can access the internet through a proxy and that you know what a proxy and a proxy port are. If not, do not add models, because both accessing the huggingface.co website and downloading models require a proxy.
Search for the model you want to use at https://huggingface.co/models. Note that it must be compatible with faster-whisper/ctranslate2; otherwise it cannot be used.
For example, I found this model: https://huggingface.co/zh-plus/faster-whisper-large-v2-japanese-5k-steps
Its model page says: "Converted from clu-ling/whisper-large-v2-japanese-5k-steps using CTranslate2."
Because the page states that the model was converted using CTranslate2, it can be used.
As shown in the figure above, click to copy the model ID. Then open set.ini in the software directory, find the model_list= line, add a comma at the end of that line, and paste the ID you copied. Save the changes.
Open the software, fill in the network proxy address, select the name you just pasted from the model list, and click Start.
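The set.ini edit described above can also be sketched in Python. This is only an illustration, assuming that model_list= holds a comma-separated list on a single line and that the file is UTF-8 encoded; the helper names `add_model_id` and `update_set_ini` are hypothetical and are not part of the stt project.

```python
from pathlib import Path


def add_model_id(ini_text: str, model_id: str) -> str:
    """Return ini_text with model_id appended to the model_list= line.

    Assumes the line looks like `model_list=a,b,c` (comma-separated).
    """
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("model_list="):
            key, _, value = line.partition("=")
            models = [m.strip() for m in value.split(",") if m.strip()]
            if model_id not in models:  # avoid adding the same ID twice
                models.append(model_id)
            lines[i] = key + "=" + ",".join(models)
            return "\n".join(lines) + "\n"
    raise ValueError("no model_list= line found")


def update_set_ini(path: str, model_id: str) -> None:
    """Rewrite the given set.ini in place (UTF-8 is an assumption)."""
    ini_path = Path(path)
    text = ini_path.read_text(encoding="utf-8")
    ini_path.write_text(add_model_id(text, model_id), encoding="utf-8")


# Example call, run from the software directory:
# update_set_ini("set.ini", "zh-plus/faster-whisper-large-v2-japanese-5k-steps")
```

Back up set.ini before running anything like this, so a manual edit remains possible if the file layout differs from the assumption.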
If you are using v2ray-like software, the default proxy address is http://127.0.0.1:10809. If you are using clash-like software, the default proxy address is http://127.0.0.1:7890.
Note: The video language you select must match the language supported by the model you added. If you select a Japanese model for a Chinese video, you will not get the expected results.
After execution starts, if the model is not found locally during the subtitle recognition stage, the software automatically connects to huggingface.co to download it. Depending on your proxy, this may take from a few minutes to tens of minutes, so please be patient.
As long as no red error appears, the download is in progress. A red error is almost always a proxy problem, such as the proxy being too slow or unstable. The error message usually contains Connection to huggingface.co timed out, or a string of digits such as 46573454354, which indicates incomplete data.
Note: If you deployed from source code, a faulty proxy connection will only produce errors like No such file xxxx.
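Since most download errors come down to the proxy, it can help to check the proxy independently of the software. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the proxy address passed in should be whatever your own proxy software actually listens on.

```python
import urllib.error
import urllib.request


def build_opener_for(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def can_reach(url: str, proxy_url: str, timeout: float = 10.0) -> bool:
    """Return True if url answers through the proxy within timeout seconds."""
    opener = build_opener_for(proxy_url)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # The server replied with an error status, so the proxy itself works.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar problems.
        return False


# Example with the clash-like default address mentioned above:
# print(can_reach("https://huggingface.co", "http://127.0.0.1:7890"))
```

If this returns False, fix the proxy first. If it returns True but downloads still fail, the proxy may simply be too slow or unstable for large model files.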