Adding New Models from Hugging Face

This document applies to the stt speech-to-text project: https://github.com/jianchang512/stt

Starting with version 0.0.94, you can add faster-whisper/CTranslate2-compatible models from huggingface.co, such as models built for a specific language, to cover cases where the general-purpose models fall short.

How to Add

  1. Upgrade to version 0.0.94 or later.

  2. Make sure you can access the internet through a proxy and that you know your proxy address and port. Visiting huggingface.co and downloading models both require a proxy, so do not add models if you cannot meet this requirement.

  3. Search https://huggingface.co/models for the model you want to use. It must be compatible with faster-whisper/CTranslate2; otherwise it cannot be used.

    For example, I found this model: https://huggingface.co/zh-plus/faster-whisper-large-v2-japanese-5k-steps

    Its model page states: "Converted from clu-ling/whisper-large-v2-japanese-5k-steps using CTranslate2."

    Because the page declares that the model was converted with CTranslate2, it can be used.

  4. As shown in the figure above, click the copy button to copy the model ID. Then open set.ini in the software directory, find the line beginning with model_list=, append a comma followed by the copied ID to the end of that line, and save the file.

  5. Open the software, enter your network proxy address, select the model ID you just added from the model list, and click Start.

    If you use v2ray or similar software, the default proxy address is http://127.0.0.1:10809. If you use Clash or similar software, the default proxy address is http://127.0.0.1:7890.

    Note: The language you select for the video must match a language supported by the model you added. If you select a Japanese model but the video is in Chinese, you will not get the expected results.

  6. After processing starts, if the model is not found locally when the subtitle recognition stage begins, it is downloaded automatically from huggingface.co. Depending on your proxy, this can take from a few minutes to tens of minutes, so please be patient.

    As long as no red error message appears, the download is still in progress. A red error almost always indicates a proxy problem, such as a slow or unstable proxy. The error message usually contains "Connection to huggingface.co timed out" or a long string of digits such as 46573454354, which indicates incomplete data.

    Note: If you are running from the source code, a proxy failure is not reported as a network error; you will only see errors like No such file xxxx.
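
For step 3, when a model page does not say how the model was converted, its file list can serve as a rough check. The sketch below is only a heuristic and is not part of stt: it assumes that a CTranslate2 export ships a model.bin file next to config.json, and the function name is hypothetical.

```python
def looks_like_ctranslate2_model(filenames):
    """Heuristic check, assuming a CTranslate2 export contains model.bin and config.json."""
    names = set(filenames)
    return "model.bin" in names and "config.json" in names


# File names as they might appear on the "Files" tab of a model page (illustrative).
converted = ["config.json", "model.bin", "tokenizer.json", "vocabulary.txt"]
original = ["config.json", "pytorch_model.bin", "tokenizer.json"]

print(looks_like_ctranslate2_model(converted))
print(looks_like_ctranslate2_model(original))
```

A passing check does not guarantee the model works with faster-whisper; the statement on the model page remains the more reliable signal.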
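
The set.ini edit in step 4 can also be sketched in code. This is a minimal illustration, assuming set.ini contains a single line of the form model_list=a,b,c; the helper name and the sample contents (lang, tiny, base) are hypothetical and do not come from the stt project.

```python
def add_model_id(ini_text, model_id):
    """Append model_id to the comma-separated model_list= line, if not already present."""
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        key, sep, value = line.partition("=")
        if sep and key.strip() == "model_list":
            ids = [item.strip() for item in value.split(",") if item.strip()]
            if model_id not in ids:
                ids.append(model_id)
            lines[i] = "model_list=" + ",".join(ids)
            return "\n".join(lines) + "\n"
    raise ValueError("no model_list= line found")


# Hypothetical set.ini contents, used only for illustration.
sample = "lang=zh\nmodel_list=tiny,base\n"
print(add_model_id(sample, "zh-plus/faster-whisper-large-v2-japanese-5k-steps"))
```

Skipping an ID that is already listed keeps the line free of duplicates if the edit is applied twice.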
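
Before entering a proxy address in step 5, it can help to confirm that the address is well formed and that something is listening on that port. The following sketch uses only the Python standard library; the function names are hypothetical, and a successful connection only shows that the port is open, not that the proxy can actually reach huggingface.co.

```python
import socket
from urllib.parse import urlparse


def proxy_endpoint(proxy_url):
    """Split a proxy URL such as http://127.0.0.1:10809 into (host, port)."""
    parsed = urlparse(proxy_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname or parsed.port is None:
        raise ValueError("expected a proxy address like http://127.0.0.1:10809")
    return parsed.hostname, parsed.port


def proxy_is_listening(proxy_url, timeout=2.0):
    """Return True if a TCP connection to the proxy host and port succeeds."""
    host, port = proxy_endpoint(proxy_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# The two default addresses mentioned in step 5.
for url in ("http://127.0.0.1:10809", "http://127.0.0.1:7890"):
    print(url, proxy_is_listening(url))
```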