Adding New Models from Hugging Face
This document applies to the stt speech-to-text project: https://github.com/jianchang512/stt
From version 0.0.94 onwards, it is possible to add models compatible with faster-whisper/ctranslate2 from huggingface.co. This allows you to leverage specialized models, such as those trained for a specific language, to overcome the limitations of general-purpose models.
How to Add a Model
Upgrade to version 0.0.94 or later.
Make sure you have a working proxy and know how to use it. You need to know what a proxy address and a proxy port are. If you do not meet this requirement, do not attempt to add models, because both accessing huggingface.co and downloading models require a proxy.
Search for the desired model on https://huggingface.co/models. Make sure the model is compatible with faster-whisper/ctranslate2; otherwise, it will not work.
For example, I found this model: https://huggingface.co/zh-plus/faster-whisper-large-v2-japanese-5k-steps
Its model page states: "Converted from clu-ling/whisper-large-v2-japanese-5k-steps using CTranslate2."
Because the page declares that CTranslate2 was used for the conversion, the model can be used.
On the model page, click to copy the model ID (for the model above, the ID is zh-plus/faster-whisper-large-v2-japanese-5k-steps). Then open the set.ini file in the software directory, find the model_list= line, add a comma at the end, paste the copied ID, and save the changes.
Open the software, fill in the network proxy address, select the newly pasted name from the model list, and click "Start."
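The set.ini edit described above can be sketched in Python. This is only an illustration, assuming the file contains a single line of the form model_list=name1,name2; the function name add_model_id and the sample entries tiny, base, and small are hypothetical and not taken from the project. The sketch works on a text string, so you would read and write set.ini yourself.

```python
def add_model_id(ini_text, model_id):
    """Return ini_text with model_id appended to the model_list= line.

    Assumes (hypothetically) one comma-separated model_list= line.
    """
    out = []
    found = False
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "model_list":
            found = True
            # Split the existing entries and drop empty ones (e.g. a trailing comma).
            ids = [m.strip() for m in value.split(",") if m.strip()]
            if model_id not in ids:
                ids.append(model_id)
            line = key.strip() + "=" + ",".join(ids)
        out.append(line)
    if not found:
        raise ValueError("no model_list= line found")
    return "\n".join(out) + "\n"


# Hypothetical sample content; the real set.ini will contain other entries.
sample = "model_list=tiny,base,small\n"
print(add_model_id(sample, "zh-plus/faster-whisper-large-v2-japanese-5k-steps"))
```

Calling the function twice with the same ID leaves the line unchanged, so an accidental repeat does not create a duplicate entry.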
If you are using V2Ray-like software, the default proxy address is http://127.0.0.1:10809. If you are using Clash-like software, the default proxy address is http://127.0.0.1:7890.
Note: The selected video language must match the language supported by the model you added. If you select a Japanese model but choose a Chinese video, you will not get the expected results.
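To make "proxy address" and "proxy port" concrete, the following sketch splits an address into its host and port and checks that it is well formed before you paste it into the software. The names DEFAULT_PROXIES and split_proxy are hypothetical helpers for illustration; the two addresses are the defaults quoted above, and your own client may be configured differently.

```python
from urllib.parse import urlsplit

# Defaults quoted in this guide; your proxy client may use other ports.
DEFAULT_PROXIES = {
    "v2ray": "http://127.0.0.1:10809",
    "clash": "http://127.0.0.1:7890",
}


def split_proxy(address):
    """Return (host, port) for a proxy address such as http://127.0.0.1:7890."""
    parts = urlsplit(address)
    # A usable address needs a scheme, a host, and an explicit port.
    if parts.scheme not in ("http", "https") or not parts.hostname or parts.port is None:
        raise ValueError("not a usable proxy address: %r" % address)
    return parts.hostname, parts.port


host, port = split_proxy(DEFAULT_PROXIES["clash"])
print(host, port)
```

A common mistake is to omit the http:// prefix; in this sketch an address such as 127.0.0.1:7890 is rejected with a ValueError.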
After starting the process, if the model is not found locally during the subtitle recognition phase, it will automatically connect to huggingface.co to download it. Depending on your proxy situation, this may take a few minutes to tens of minutes. Please be patient.
As long as no red error messages appear, the download is in progress. If red error messages do appear, the cause is usually a proxy problem, such as a slow or unstable proxy. The error message generally contains Connection to huggingface.co timed out, or a string of numbers such as 46573454354, which indicates incomplete data.
Note: If you deploy from source code, a proxy network error is not reported as such; you will only see errors like No such file xxxx.
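The error patterns described above can be summarized as a small lookup. This is a rough heuristic sketch, not part of the software: the function name diagnose, the returned labels, and the eight-digit threshold for "a string of numbers" are all assumptions made for illustration.

```python
import re


def diagnose(message):
    """Map an error message to a likely cause, following the notes above."""
    if "Connection to huggingface.co timed out" in message:
        return "proxy timeout"
    if "No such file" in message:
        # Source deployments report proxy failures this way, per the note above.
        return "model file missing, possibly a proxy failure"
    # Assumption: a run of 8 or more digits is the "string of numbers" case.
    if re.search(r"\b\d{8,}\b", message):
        return "incomplete download"
    return "unknown"


print(diagnose("Connection to huggingface.co timed out"))
print(diagnose("46573454354"))
```

In every recognized case the practical remedy is the same: check that the proxy is running and stable, then start the task again.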