All Speech Recognition Model Downloads | pyVideoTrans: Open Source Video Translation Tool (pyvideotrans.com, github.com/jianchang512/pyvideotrans)

This page's content originates from my open-source project's release page. For the latest information, please visit GitHub: https://github.com/jianchang512/stt/releases/0.0

faster-whisper Model Downloads: For the faster-whisper mode of the stt project and of the "pyvideotrans video translation dubbing" project. For openai-whisper models, scroll down.

tiny (64MB), tiny.en (64MB)

base (124MB), base.en (124MB)

small (415MB, also on Baidu Netdisk), small.en (415MB)

medium (1.27G), medium.en (1.27G)

large-v1 (Baidu Netdisk, huggingface)

large-v2 (huggingface, Baidu Netdisk)

large-v3 (huggingface, Baidu Netdisk)

large-v3-turbo (1.3G)

distil-whisper-small.en (282MB)

distil-whisper-medium.en (671MB, also on Baidu Netdisk)

distil-whisper-large-v2 (1.27G, also on Baidu Netdisk)

distil-whisper-large-v3 (1.3G, also on Baidu Netdisk)

After downloading, extract the archive and copy the "models--Systran--faster-xx" folder from inside it into the "models" directory. Once copied, the "models" directory should contain that folder directly, as the two screenshots illustrated:
[Image: compressed package content]
[Image: folder list under the correctly placed "models" directory]
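
As a quick sanity check after copying, the folder placement described above can be verified with a short script. This is only a sketch: the function name `find_faster_whisper_models` is hypothetical, and matching on the `models--` prefix is an assumption based on the "models--Systran--faster-xx" naming mentioned above, not something the software itself is known to do.

```python
from pathlib import Path


def find_faster_whisper_models(models_dir):
    """List folders directly under models_dir whose names start with 'models--'.

    Assumption: every downloaded archive contains a folder named like
    'models--Systran--faster-xx', so the prefix identifies model folders.
    """
    root = Path(models_dir)
    if not root.is_dir():
        # A missing "models" directory simply means nothing is installed yet.
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.startswith("models--")
    )


if __name__ == "__main__":
    # "models" is relative to the current directory; adjust to your install path.
    for name in find_faster_whisper_models("models"):
        print(name)
```

If the script prints nothing, the folder was most likely copied one level too deep (for example, the whole extracted archive folder was placed inside "models" instead of the "models--..." folder itself).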

openai-whisper Model Downloads: Only applicable for the openai-whisper mode of the "pyvideotrans video translation dubbing software."

After downloading, place the .pt files into the "models" folder within the software directory.

tiny.pt model tiny.en.pt model

base.pt model base.en.pt model

large-v3-turbo.pt model

FunASR Chinese Model Downloads

Baidu Netdisk Download (including speech recognition, punctuation restoration, and noise reduction models): https://pan.baidu.com/s/1v5wagiid6-K7GX9Pif4reA?pwd=y2ef

Huggingface (overseas download address): https://huggingface.co/spaces/mortimerme/s4/resolve/main/FunASR-Chinese-models.7z?download=true

After downloading and extracting, you will see 3 folders: iic, damo, .__temp. Copy all three into the models/hub folder of the video translation software, overwriting any existing files.
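
The copy-and-overwrite step can also be scripted. The following is a minimal sketch, assuming Python 3.8 or later (for `dirs_exist_ok`); the function name `install_funasr_models` and both path parameters are hypothetical and stand in for wherever you extracted the archive and installed the software.

```python
import shutil
from pathlib import Path

# Folder names as listed in the instructions above.
FUNASR_FOLDERS = ("iic", "damo", ".__temp")


def install_funasr_models(extracted_dir, software_dir):
    """Copy the extracted FunASR folders into <software_dir>/models/hub.

    Existing files with the same names are overwritten. Returns the names
    of the folders that were actually found and copied.
    """
    source_root = Path(extracted_dir)
    hub = Path(software_dir) / "models" / "hub"
    hub.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in FUNASR_FOLDERS:
        source = source_root / name
        if source.is_dir():
            # dirs_exist_ok=True merges into an existing folder and overwrites files.
            shutil.copytree(source, hub / name, dirs_exist_ok=True)
            copied.append(name)
    return copied
```

A return value shorter than three names indicates that the archive was extracted to a different location than the one passed in.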

cuBLASxx.dll and cuDNN Downloads

If you encounter "cublasxxx.dll does not exist" or experience crashes after enabling CUDA acceleration, download the archive below that matches your CUDA version. Copy the DLL files it contains to C:/Windows/System32/ or to the software's root directory (where the .exe is located).

To check your current CUDA version, type cmd in the address bar of any folder to open a Command Prompt window, then run nvcc -V.

Download CUDA 11.x version here: https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA11_win_v4.7z

Download CUDA 12.x version here: https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA12_win_v1.7z
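
Choosing between the two archives comes down to reading the major version from the nvcc -V output. The sketch below assumes that output contains a line of the form "release 12.1" (as current CUDA toolkits print); the function names are hypothetical, and the URLs are the two listed above.

```python
import re
import subprocess

# Download links copied from the list above, keyed by CUDA major version.
ARCHIVES = {
    11: "https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA11_win_v4.7z",
    12: "https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS.and.cuDNN_CUDA12_win_v1.7z",
}


def cuda_major_from_nvcc_output(text):
    """Extract the CUDA major version from 'nvcc -V' output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return int(match.group(1)) if match else None


def pick_archive(text):
    """Return the matching download URL, or None for an unknown version."""
    return ARCHIVES.get(cuda_major_from_nvcc_output(text))


def read_nvcc_output():
    """Run 'nvcc -V' and return its output, or None if nvcc is unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "-V"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return result.stdout


if __name__ == "__main__":
    output = read_nvcc_output()
    if output is None:
        print("nvcc not found; the CUDA toolkit may not be installed or on PATH.")
    else:
        print(pick_archive(output) or "No archive listed for this CUDA version.")
```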

uvr5 Model Download

Click here to download the uvr5 model

After downloading and extracting, you'll get a uvr5_weights folder. Copy this folder to the root directory of the video translation and dubbing software.

ffmpeg.exe Download

If you are using Windows and encounter a message indicating that the ffmpeg command is not found, download the following two files and place them in the software's root directory or in a folder named "ffmpeg" within the root directory.

https://github.com/jianchang512/stt/releases/download/0.0/ffmpeg.exe

https://github.com/jianchang512/stt/releases/download/0.0/ffprobe.exe
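
To confirm the two files ended up where the instructions above say they can go, a small lookup script may help. This is a sketch only: `locate_tool` is a hypothetical helper, and the search order (root directory, then the "ffmpeg" subfolder, then the system PATH) is an assumption for illustration, not necessarily the order the software uses.

```python
import shutil
from pathlib import Path


def locate_tool(name, software_dir):
    """Return the path to <name>.exe if present in an expected location, else None.

    Checks the software's root directory, then its "ffmpeg" subfolder,
    and finally falls back to whatever is on the system PATH.
    """
    root = Path(software_dir)
    for candidate in (root / f"{name}.exe", root / "ffmpeg" / f"{name}.exe"):
        if candidate.is_file():
            return str(candidate)
    # shutil.which returns None when the command is not on PATH.
    return shutil.which(name)


if __name__ == "__main__":
    # "." assumes the script is run from the software's root directory.
    for tool in ("ffmpeg", "ffprobe"):
        print(tool, "->", locate_tool(tool, ".") or "not found")
```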
