
This page's content originates from my open-source project's release page. For the latest information, please visit GitHub: https://github.com/jianchang512/stt/releases/0.0

faster-whisper Model Downloads: Suitable for the faster-whisper mode in the stt project and the "pyvideotrans video translation dubbing" project. For openai-whisper models, please scroll down.


tiny 64MB

tiny.en 64MB

base 124MB

base.en 124MB

small 415MB

small Baidu Netdisk

small.en 415MB

medium 1.27GB

medium.en 1.27GB

large-v1 Baidu Netdisk

large-v1 huggingface

large-v2 huggingface

large-v2 Baidu Netdisk

large-v3 huggingface

large-v3 Baidu Netdisk

large-v3-turbo 1.3GB

distil-whisper-small.en 282MB

distil-whisper-medium.en 671MB

distil-medium Baidu Netdisk

distil-whisper-large-v2 1.27GB

distil-large-v2 Baidu Netdisk

distil-whisper-large-v3 1.3GB

distil-whisper-large-v3 Baidu Netdisk

After downloading, extract the archive and copy the "models--Systran--faster-xx" folder from inside it into the "models" directory. Once copied, the folder list under the "models" directory should look like this:

[Image: compressed package content]

[Image: folder list under the correctly placed "models" directory]
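To confirm the folders landed in the right place, a small check like the one below may help. It is only a sketch: the function name list_model_folders is hypothetical, and the "models--" prefix test is an assumption based on the folder name pattern mentioned above.

```python
from pathlib import Path


def list_model_folders(models_dir):
    """Return the sorted names of sub-folders whose name starts with 'models--'."""
    root = Path(models_dir)
    if not root.is_dir():
        # The "models" directory does not exist yet, so nothing was copied.
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.startswith("models--")
    )
```

Calling list_model_folders("models") from the software directory should list one entry per model you copied. An empty list suggests the folder was copied one level too deep or too shallow.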




openai-whisper Model Downloads: Only applicable for the openai-whisper mode of the "pyvideotrans video translation dubbing software."


After downloading, place the .pt files into the "models" folder within the software directory.

tiny.pt model

tiny.en.pt model

base.pt model

base.en.pt model

small.pt model

small.en.pt model

medium.pt model

medium.en.pt model

large-v1.pt model

large-v2.pt model

large-v3.pt model

large-v3-turbo.pt model




FunASR Chinese Model Downloads

Baidu Netdisk Download (including speech recognition, punctuation restoration, and noise reduction models): https://pan.baidu.com/s/1v5wagiid6-K7GX9Pif4reA?pwd=y2ef

Huggingface (overseas download address): https://huggingface.co/spaces/mortimerme/s4/resolve/main/FunASR-Chinese-models.7z?download=true

After downloading and extracting, you will see 3 folders: iic, damo, .__temp. Copy all three into the models/hub folder of the video translation software, overwriting any existing files.
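The copy-and-overwrite step can also be scripted. The following is a minimal sketch, assuming Python 3.8 or newer; the function name copy_funasr_folders and both path arguments are hypothetical, and only the three folder names come from the text above.

```python
import shutil
from pathlib import Path

# Folder names as listed above.
FUNASR_FOLDERS = ("iic", "damo", ".__temp")


def copy_funasr_folders(extracted_dir, hub_dir):
    """Copy the FunASR folders into hub_dir, overwriting files that already exist.

    Returns the names of the folders that were found and copied.
    """
    src_root, dst_root = Path(extracted_dir), Path(hub_dir)
    copied = []
    for name in FUNASR_FOLDERS:
        src = src_root / name
        if src.is_dir():
            # dirs_exist_ok=True merges into an existing folder instead of failing.
            shutil.copytree(src, dst_root / name, dirs_exist_ok=True)
            copied.append(name)
    return copied
```

For example, copy_funasr_folders("FunASR-Chinese-models", "models/hub") would merge the extracted folders into models/hub, provided the archive was extracted to a folder of that name (an assumption).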




cuBLASxx.dll and cuDNN Downloads

If you encounter "cublasxxx.dll does not exist" or experience crashes after enabling CUDA acceleration, please download the appropriate file. Copy the DLL files inside to C:/Windows/System32/ or the software's root directory (where the .exe is located).

To check your current CUDA version, type cmd in the address bar of any folder in File Explorer and press Enter to open a Command Prompt window, then run nvcc -V.

Download CUDA 11.x version here: https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA11_win_v4.7z

Download CUDA 12.x version here: https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA12_win_v1.7z
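Choosing between the two archives can be sketched in code. The example below assumes the nvcc -V output contains a line such as "Cuda compilation tools, release 12.1, V12.1.105"; the function names are hypothetical, and the two URLs are the ones listed above.

```python
import re
import subprocess

# Download links from this page, keyed by CUDA major version.
CUDA_ARCHIVES = {
    11: "https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA11_win_v4.7z",
    12: "https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA12_win_v1.7z",
}


def cuda_major_version(nvcc_output):
    """Extract the CUDA major version from `nvcc -V` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return int(match.group(1)) if match else None


def archive_for(nvcc_output):
    """Return the matching download URL, or None for an unknown version."""
    return CUDA_ARCHIVES.get(cuda_major_version(nvcc_output))


def detect_archive():
    """Run `nvcc -V` and pick an archive; returns None if nvcc is not installed."""
    try:
        result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return archive_for(result.stdout)
```

If detect_archive() returns None, either nvcc is not on the PATH or the installed CUDA version is neither 11.x nor 12.x, in which case neither archive listed here applies.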



uvr5 Model Download

Click here to download the uvr5 model

After downloading and extracting, you'll get a uvr5_weights folder. Copy this folder to the root directory of the video translation and dubbing software.



ffmpeg.exe Download

If you are using Windows and encounter a message indicating that the ffmpeg command is not found, download the following two files and place them in the software's root directory or in a folder named "ffmpeg" within the root directory.

https://github.com/jianchang512/stt/releases/download/0.0/ffmpeg.exe

https://github.com/jianchang512/stt/releases/download/0.0/ffprobe.exe
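To verify that both files are where the software can find them, something like the following sketch could be used. The function name find_tool is hypothetical, and the lookup order (root directory, then the "ffmpeg" sub-folder, then the system PATH) is an assumption rather than the software's documented behavior.

```python
import shutil
from pathlib import Path


def find_tool(name, root_dir):
    """Return the path to name.exe near root_dir, else a PATH match, else None."""
    root = Path(root_dir)
    candidates = (root / f"{name}.exe", root / "ffmpeg" / f"{name}.exe")
    for candidate in candidates:
        if candidate.is_file():
            return str(candidate)
    # Fall back to whatever is available on the system PATH.
    return shutil.which(name)
```

Running find_tool("ffmpeg", ".") and find_tool("ffprobe", ".") from the software's root directory should both return a path. A None result means the file still needs to be downloaded or moved.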