pyVideoTrans Common Issues and Solutions (FAQ)
To help you better use pyVideoTrans, we have compiled the following common issues and their solutions.
In the menu bar - Help/About, there are many links, such as model download addresses, CUDA configuration, etc. Try opening and using them if you encounter problems.

Part 1: Installation and Startup Issues
1. After double-clicking sp.exe, the software does not open or there is no response for a long time?
This is usually normal; please do not worry.
- Reason: This software is built with PySide6, and the main interface contains many components, so the first startup takes time to load. Depending on your computer's performance, startup may take anywhere from 5 seconds to 2 minutes.
- Solutions:
- Wait patiently: After double-clicking, please wait for a while.
- Check security software: Some antivirus or security software may block the program from starting. Try temporarily disabling them or adding this software to the trust/whitelist.
- Check file path: Ensure the software's storage path contains only English letters and numbers; it should not include Chinese characters, spaces, or special symbols. For example, `D:\pyVideoTrans` is a good path, while `D:\program file\视频工具` (which contains a space and Chinese characters) may cause issues.
- Upgrade package issue: If the software cannot start after you overwrite it with an upgrade package, the overwrite was done incorrectly. Re-download the complete software package, extract it, and then overwrite it with the new upgrade package.
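The path rule above can be checked with a short script. This is a minimal sketch for illustration; the regex is an approximation of the rule, not the software's actual validation logic:

```python
import re

def is_safe_path(path: str) -> bool:
    """Return True if the path contains only ASCII letters, digits,
    and common path characters -- the kind of path pyVideoTrans expects."""
    # Allow drive letters, slashes, backslashes, dots, hyphens, underscores.
    return re.fullmatch(r"[A-Za-z0-9:\\/._-]+", path) is not None

print(is_safe_path(r"D:\pyVideoTrans"))         # True
print(is_safe_path(r"D:\program file\tools"))   # False (contains a space)
```

Any path that fails this check is worth moving or renaming before launching the software.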
2. What to do if prompted about a missing python310.dll file at startup?
This issue indicates that you only downloaded the upgrade patch and not the main program.
- Solution:
- Please go to the official website to download the 2.5GB complete software package.
- Extract the complete package to a specified directory.
- Then download the latest upgrade patch and overwrite it into the complete package directory.
3. Does the software need to be installed?
This software is a portable version and does not require installation. After downloading the complete package, extract it and double-click sp.exe to run directly.
4. Why does antivirus software report a virus or block it?
- Reason: This software is packaged with the PyInstaller tool and does not carry a commercial digital signature. Some security software may flag it as a risk, which is a common false positive.
- Solutions:
- Add to trust: Add this software to your antivirus software's trust zone or whitelist.
- Run from source: If you are a developer, you can choose to deploy and run directly from the source code to completely avoid this issue.
5. Does the software support Windows 7?
No. Many core components that the software relies on no longer support Windows 7.
Part 2: Core Features and Settings
6. How to improve speech recognition accuracy?
Recognition accuracy mainly depends on the model size you choose.
- Model selection: In "faster" or "openai" mode, larger models provide higher accuracy but slower processing speed and higher resource consumption.
  - tiny: smallest size, fastest speed, but lower accuracy.
  - base/small/medium: moderate accuracy and resource consumption; the most commonly used options.
  - large-v3: largest size, best accuracy, highest hardware requirements.
- Model download: All models can be downloaded from the official website: pyvideotrans.com/model
7. Why does the processed video have reduced clarity/quality?
Any operation involving re-encoding will inevitably lead to video quality loss. If you wish to preserve the original quality as much as possible, ensure all the following conditions are met:
- Original video format: Use the most compatible H.264 (libx264) encoded MP4 file.
- Disable slow processing: In the function options, do not check "Video Auto Slow".
- Do not embed hard subtitles: You can choose not to embed subtitles or only embed soft subtitles. Hard subtitles force re-encoding of the entire video.
- Do not change audio or duration: Do not perform dubbing, or if dubbing, disable the video end extension function.
8. How to configure a network proxy?
Some translation or dubbing services (e.g., Google, OpenAI, Gemini) are not directly accessible in certain regions and require a network proxy.
- Setup method: In the main interface's "Network Proxy Address" text box, enter your proxy service address.
- Format requirements: Usually `http://127.0.0.1:10808` or `socks5://127.0.0.1:10808` (the port number should match your proxy client's settings).
- Important note: If you are unfamiliar with proxies or have no working proxy service, leave this field blank. Incorrect settings will cause all network functions (including domestic services) to fail.
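Before pasting an address into the text box, you can sanity-check its format with the standard library. This is an illustration, not part of the software:

```python
from urllib.parse import urlsplit

def check_proxy(addr: str) -> bool:
    """Return True if addr looks like a usable proxy URL:
    a known scheme, a host, and an explicit port."""
    parts = urlsplit(addr)
    return (parts.scheme in ("http", "https", "socks5")
            and bool(parts.hostname)
            and parts.port is not None)

print(check_proxy("http://127.0.0.1:10808"))   # True
print(check_proxy("127.0.0.1:10808"))          # False -- the scheme is missing
```

A common mistake is omitting the `http://` or `socks5://` prefix; the check above catches that case.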
9. How to customize subtitle font, color, and style?
- Open the menu bar Tools/Options -> Advanced Options -> Hard Subtitle Related.
- Here you can modify the hard subtitle's font, size, color, border style, etc.
- Color code explanation: Color codes use the format &HBBGGRR&, which is the reverse of common RGB: the channels are ordered Blue, Green, Red (BGR).
  - White: &HFFFFFF&
  - Black: &H000000&
  - Pure Red: &H0000FF&
  - Pure Green: &H00FF00&
  - Pure Blue: &HFF0000&
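If you know a color's RGB value, the BGR code can be computed with a one-line helper (the function name is illustrative, not from the software):

```python
def rgb_to_ass(r: int, g: int, b: int) -> str:
    """Convert an RGB color (0-255 per channel) to the &HBBGGRR& form
    used for hard subtitle color codes."""
    return f"&H{b:02X}{g:02X}{r:02X}&"

print(rgb_to_ass(255, 0, 0))  # &H0000FF&  (pure red)
print(rgb_to_ass(0, 0, 255))  # &HFF0000&  (pure blue)
```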

Part 3: Common Issues and Troubleshooting
10. Why is there desynchronization between audio, subtitles, and video after processing?
This is normal in language translation.
- Reason: When expressing the same meaning in different languages, sentence length and pronunciation duration change. For example, a 2-second Chinese sentence might become a 4-second dubbing in English. This duration change can cause the dubbing to not perfectly align with the original video's lip movements and timeline.
11. Always prompted with insufficient VRAM (e.g., Unable to allocate error)?
This error means your graphics card does not have enough memory (VRAM) to perform the current task, usually due to using large models or processing high-resolution videos.
- Solutions (try in recommended order):
- Use a smaller model: Change the recognition model from large-v3 to medium, small, or base. The large-v3 model requires at least 8GB of VRAM, and other programs also consume VRAM while running.
- Adjust advanced settings: In the menu bar Tools/Options -> Advanced Options, make the following changes to trade some accuracy for lower VRAM usage:
  - CUDA Data Type: change from float32 to float16 or int8.
  - beam_size: change from 5 to 1.
  - best_of: change from 5 to 1.
  - Context: change from true to false.
12. CUDA is installed, but why can't the software use GPU acceleration?
Check the following possible reasons:
- CUDA version incompatibility: The built-in CUDA support version in this software is 12.8. If your CUDA version is too low, it cannot be called.
- Outdated graphics driver: Please update your NVIDIA graphics driver to the latest version.
- Missing cuDNN: Ensure you have correctly installed cuDNN matching your CUDA version.
- Hardware incompatibility: GPU acceleration only supports NVIDIA graphics cards. AMD or Intel graphics cards cannot use CUDA.
13. Error during execution with messages containing “ffprobe exec error” or ffmpeg?
This error is usually related to file paths that are too long or contain special symbols.
- Reason: Windows systems have a maximum path length limit (usually 260 characters). If your video file name is very long (e.g., downloaded from YouTube) and stored in a deeply nested folder, the total path may exceed this limit.
- Solution: Move the video file to a shallower directory (e.g., `D:\videos`) and rename it with a short name using only English letters or numbers.
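You can check whether a file would trip the limit before processing. A simple sketch; 260 is the classic Windows MAX_PATH value, not a value read from the software:

```python
MAX_WIN_PATH = 260  # classic Windows MAX_PATH limit

def path_too_long(path: str) -> bool:
    """Return True if the full path length reaches the classic Windows limit."""
    return len(path) >= MAX_WIN_PATH

print(path_too_long("D:\\videos\\clip.mp4"))                 # False
print(path_too_long("D:\\videos\\" + "x" * 300 + ".mp4"))    # True
```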
14. Software prompts that the video "contains no audio track"?
- Possible reason 1: The video indeed has no sound. For example, videos downloaded from YouTube and some other sites may have separate video and audio tracks, and errors during merging could result in lost audio.
- Possible reason 2: Excessive background noise. If the video environment is very noisy (e.g., streets, concerts), human speech may be masked, and the model might not detect valid speech.
- Possible reason 3: Incorrect language selection. Ensure the language selected in the "Original Speech" option matches the actual language spoken in the video. For example, if the video contains English dialogue, you must select "English" for correct recognition.
15. GPU usage is very low; is this normal?
Yes, it's normal. The software workflow is: Speech Recognition -> Text Translation -> Text-to-Speech -> Video Synthesis.
Only the first step, "Speech Recognition", heavily uses the GPU for computation. Other stages (e.g., translation, synthesis) primarily rely on the CPU, so low GPU load most of the time is expected.
16. Why do recognition results and subtitles remain unchanged when repeatedly processing the same video?
- Reason: To save time and computational resources, the software enables caching by default. If it detects that subtitles have already been generated for a video, it will use the cached results instead of reprocessing.
- Solution: If you wish to force re-recognition and re-translation, check the "Clean Generated" checkbox in the top-left corner of the main interface.

17. After processing a few videos, the hard drive space is full?
This is usually due to enabling the "Video Slow" feature, which generates a large number of temporary files.
- Reason: This feature splits the video into many small segments based on subtitles and processes each segment, producing cache files that far exceed the original video's size.
- Solutions:
- Manual cleanup: After processing, manually delete all contents of the `tmp` folder in the software's root directory.
- Automatic cleanup: The program automatically cleans these caches when it is closed normally.
- System cache: The software may also create a `pyvideotrans` folder in your user directory (by default `C:/Users/YourUsername/`) and an `output` folder in the software directory. You can delete them manually after closing the software.
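The manual cleanup step can be scripted. A hedged sketch: `SOFTWARE_DIR` is a placeholder you must point at your own install directory, and the script should only run while the software is closed so no files are in use:

```python
import shutil
from pathlib import Path

SOFTWARE_DIR = Path(".")  # adjust to your pyVideoTrans install directory

def clean_tmp(software_dir: Path) -> int:
    """Delete everything inside software_dir/tmp; return the number of
    top-level entries removed."""
    tmp = software_dir / "tmp"
    removed = 0
    if tmp.is_dir():
        for entry in tmp.iterdir():
            if entry.is_dir():
                shutil.rmtree(entry)   # remove cached segment folders
            else:
                entry.unlink()         # remove loose cache files
            removed += 1
    return removed

# Example (run only while the software is closed):
# print(f"Removed {clean_tmp(SOFTWARE_DIR)} cache entries")
```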
Part 4: General Information
18. Does the software support Docker deployment?
Currently not supported.
19. Can it recognize hard subtitles in the video (OCR function)?
No. This software works by analyzing the video's audio track to recognize human speech and convert it to text. It does not have image text recognition (OCR) functionality.
20. Can I add new language support?
No. Adding a language requires corresponding support from speech recognition channels, subtitle translation channels, and dubbing channels. Each channel corresponds to various local models or online APIs, and they may or may not support the new language. Even if supported, different methods may require different format codes for the same language (e.g., some channels require zh for Chinese, while others require zh-cn or chi). Arbitrarily adding languages can lead to unexpected errors. Unless you can code and modify the source code yourself, you cannot add languages.
21. Is the software free? Can it be used commercially?
- Cost: This project is free and open-source software; you can use all features for free. Note that if you use third-party translation, TTS (Text-to-Speech), or speech transcription interfaces, these service providers may charge fees, but this is unrelated to this software.
- Commercial use: Both individuals and companies may use this software freely. However, if you wish to integrate this project's code into your own commercial product, you must comply with the GPL-v3 open-source license. Additionally, some channels' models or online APIs may have their own license requirements for commercial use; please consult the corresponding platform for the channel you use, e.g., for the Edge-TTS channel, consult Microsoft; for the ChatTTS dubbing channel, consult https://github.com/2noise/ChatTTS.
22. Is there customer support available?
No. This project is a free, open-source software developed by an individual, with no profit, so there is no dedicated customer support team. If you encounter issues, please read this FAQ carefully first. Alternatively, you can scan the WeChat QR code in the lower right corner of the software to make a donation and leave your WeChat ID for paid technical support.
23. Where to download the software and models?
- Software download address: pyvideotrans.com/downpackage
- Source code repository address: github.com/jianchang512/pyvideotrans
