The content on this page is sourced from my open-source project release page. For the latest information, please visit GitHub: https://github.com/jianchang512/stt/releases/0.0

Faster-Whisper Model Downloads – For the stt project and the faster-whisper mode in the "pyvideotrans video translation and dubbing" project. For OpenAI-Whisper models, scroll down.

tiny 64MB
tiny.en 64MB

base 124MB
base.en 124MB

small 415MB
small Baidu Netdisk
small.en 415MB

medium 1.27GB
medium.en 1.27GB

large-v1 Baidu Netdisk
large-v1 HuggingFace

large-v2 HuggingFace
large-v2 Baidu Netdisk

large-v3 HuggingFace
large-v3 Baidu Netdisk

large-v3-turbo 1.3GB

distil-whisper-small.en 282MB

distil-whisper-medium.en 671MB
distil-medium Baidu Netdisk

distil-whisper-large-v2 1.27GB
distil-large-v2 Baidu Netdisk

distil-whisper-large-v3 1.3GB
distil-whisper-large-v3 Baidu Netdisk

After downloading, extract the archive and copy the "models--Systran--faster-xx" folder it contains into the models directory. Copy the folder itself, not just the files inside it, so that the models directory contains one "models--Systran--faster-xx" folder per downloaded model.

[Image: archive contents]

[Image: correct folder list under the models directory after placement]
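As a quick sanity check after copying, a short script can list the model folders that ended up in the models directory. This is a minimal sketch, not part of the software: the helper name `list_model_folders` and the relative path `models` are assumptions for illustration, and it only matches the "models--Systran--faster-" prefix mentioned above, so folders with any other prefix are not listed.

```python
from pathlib import Path

# Prefix taken from the folder name mentioned above ("models--Systran--faster-xx").
PREFIX = "models--Systran--faster-"


def list_model_folders(models_dir):
    """Return the sorted names of sub-folders of models_dir that start with PREFIX."""
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir() if p.is_dir() and p.name.startswith(PREFIX)
    )


if __name__ == "__main__":
    # "models" is an assumed path, relative to the software's root directory.
    for name in list_model_folders("models"):
        print(name)
```

If the script prints nothing, the folder was probably copied one level too deep or the archive's inner files were copied without their parent folder.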




OpenAI-Whisper Model Downloads – Only for use with the openai-whisper mode in "pyvideotrans video translation and dubbing software"

After downloading, place the .pt file into the models folder under the software directory.

tiny.pt model

tiny.en.pt model

base.pt model

base.en.pt model

small.pt model

small.en.pt model

medium.pt model

medium.en.pt model

large-v1.pt model

large-v2.pt model

large-v3.pt model

large-v3-turbo.pt model




FunASR Chinese Model Downloads

Baidu Netdisk download (includes speech recognition, punctuation restoration, and noise reduction models): https://pan.baidu.com/s/1v5wagiid6-K7GX9Pif4reA?pwd=y2ef

HuggingFace (external download link): https://huggingface.co/spaces/mortimerme/s4/resolve/main/FunASR-Chinese-models.7z?download=true

After downloading and extracting, you will see three folders: iic, damo, and .__temp. Copy them to the models/hub folder in the video translation software, overwriting if necessary.
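The copy step can also be scripted. The following is a minimal sketch under stated assumptions: the function name `copy_funasr_models` and both paths in the usage line are hypothetical, and the folder names are simply the three listed above. It relies on `shutil.copytree` with `dirs_exist_ok=True`, which requires Python 3.8 or later and overwrites files that already exist at the destination.

```python
import shutil
from pathlib import Path

# Folder names as listed in the text above.
FOLDERS = ("iic", "damo", ".__temp")


def copy_funasr_models(extracted_dir, hub_dir):
    """Copy each listed folder from extracted_dir into hub_dir.

    Existing files with the same name are overwritten. Returns the names
    of the folders that were actually found and copied.
    """
    src_root, dst_root = Path(extracted_dir), Path(hub_dir)
    dst_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in FOLDERS:
        src = src_root / name
        if src.is_dir():
            # dirs_exist_ok=True merges into a folder that is already there.
            shutil.copytree(src, dst_root / name, dirs_exist_ok=True)
            copied.append(name)
    return copied
```

For example, `copy_funasr_models("FunASR-Chinese-models", "models/hub")` would copy from a hypothetical extraction folder into models/hub; adjust both paths to your own layout.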




cuBLASxx.dll and cuDNN Downloads

If you encounter a "cublasxxx.dll not found" error, or the software crashes after you enable CUDA acceleration, download the archive that matches your CUDA version and copy the .dll files it contains to either C:/Windows/System32 or the software's root directory (where the .exe is located).

To check your current CUDA version, open a command prompt (type cmd in the address bar of any folder window and press Enter), then run nvcc -V.

For CUDA 11.x, click here to download: https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA11_win_v4.7z

For CUDA 12.x, click here to download: https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA12_win_v1.7z
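Choosing between the two archives can be sketched in a few lines. This is an illustration only, assuming that `nvcc -V` prints a line containing "release X.Y" (for example "Cuda compilation tools, release 12.1, V12.1.105"); the helper names `cuda_major` and `archive_for` are made up for this example, and the two links are the ones listed above.

```python
import re
import subprocess

# Download links copied from the list above, keyed by CUDA major version.
ARCHIVES = {
    11: "https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA11_win_v4.7z",
    12: "https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA12_win_v1.7z",
}


def cuda_major(nvcc_output):
    """Extract the CUDA major version from `nvcc -V` output, or None if absent."""
    match = re.search(r"release (\d+)\.\d+", nvcc_output)
    return int(match.group(1)) if match else None


def archive_for(nvcc_output):
    """Return the matching download link, or None for other or unknown versions."""
    return ARCHIVES.get(cuda_major(nvcc_output))


if __name__ == "__main__":
    try:
        out = subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout
    except FileNotFoundError:
        out = ""  # nvcc is not on PATH
    print(archive_for(out) or "No matching archive; check the CUDA installation.")
```

If nvcc is not found at all, the CUDA toolkit is likely not installed or not on PATH, and the version has to be determined another way.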



UVR5 Model Download

Click to download UVR5 model

After downloading and extracting, you will get a uvr5_weights folder. Copy this folder to the root directory of the video translation and dubbing software.



ffmpeg.exe Download

If you are on Windows and receive an "ffmpeg command not found" error, download the following two files and place them either in the software's root directory or in the ffmpeg folder under the root directory.

https://github.com/jianchang512/stt/releases/download/0.0/ffmpeg.exe

https://github.com/jianchang512/stt/releases/download/0.0/ffprobe.exe
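To confirm that both files are where the software can find them, a small check along these lines may help. It is a sketch under assumptions: the helper name `locate` is hypothetical, the script is assumed to run from the software's root directory, and the only locations checked are the two mentioned above plus the system PATH.

```python
import shutil
from pathlib import Path

TOOLS = ("ffmpeg.exe", "ffprobe.exe")


def locate(root_dir, tool):
    """Return the first existing path for tool in root_dir or root_dir/ffmpeg, else None."""
    root = Path(root_dir)
    for candidate in (root / tool, root / "ffmpeg" / tool):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "." assumes the script is started from the software's root directory.
    for tool in TOOLS:
        # Fall back to PATH, where shutil.which looks up the bare command name.
        found = locate(".", tool) or shutil.which(Path(tool).stem)
        print(f"{tool}: {found if found else 'not found'}")
```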