
pyVideoTrans is a powerful open-source video translation tool that converts the speech and subtitles of a video from one language to another. Whether you are a content creator, educator, or language learner, it provides a one-stop solution for breaking down language barriers.

Core Features at a Glance

  • Fully Automated Video Translation: Intelligently recognizes speech in a video, generates source language subtitles, translates them to the target language, performs dubbing, and finally synthesizes the new audio and subtitles into the original video, all in one go.

  • Speech Recognition and Transcription: Accurately transcribes human speech from video or audio files in batches into SRT subtitle files with timestamps.

  • SRT Subtitle File Translation: Supports batch translation of SRT subtitle files, preserving the original timecodes and formatting, and offers various bilingual subtitle styles.

  • Text/Subtitle to Speech (TTS): Utilizes multiple advanced TTS services to generate high-quality, natural-sounding voiceovers for your text or SRT subtitle files.

  • Utility Toolkit: Includes auxiliary tools such as video/audio/subtitle merging and vocal/background sound separation for fine-tuning your results.

How the Software Works

Before you begin, it's crucial to understand the core working principle of this software:

pyVideoTrans works by recognizing and processing the human speech in the video's audio. It does not depend on whether the video already has hardcoded subtitles.

  • Can process: Any video containing human speech, regardless of whether it has embedded subtitles.
  • Cannot process: Videos that only have background music and hardcoded subtitles, but no human speech. This software also cannot directly extract hardcoded subtitles from the video frames.

Download and Installation

1.1 For Windows Users (Pre-packaged Version)

We provide a ready-to-use pre-packaged version for Windows 10/11 users, eliminating the need for complex configuration.

Download URL: https://github.com/jianchang512/pyvideotrans/releases/v3.71

Unzipping Precautions

Incorrect extraction is the most common reason the software fails to start. Please follow these rules strictly:

  1. Do not use paths requiring administrator privileges: Do not unzip to system folders like C:/Program Files, C:/Windows, or the Desktop.
  2. The path must be purely in English: The extraction path cannot contain any non-English characters, spaces, or special symbols.
  3. Recommended practice: On a non-system drive like D: or E:, create a new folder with a purely English or numeric name (e.g., D:/videotrans), and then extract the compressed package into this folder.
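The rules above can be checked before extracting. The following is a minimal sketch, not part of pyVideoTrans: the function name `check_extract_path` and the set of restricted folder names are assumptions for illustration, and the check takes a strict reading of "purely English or numeric" (ASCII letters and digits only).

```python
import re
from pathlib import PureWindowsPath

# Folder names taken from the rules above. The exact set the software
# objects to is an assumption made for this sketch.
RESTRICTED_PARTS = {"program files", "program files (x86)", "windows", "desktop"}


def check_extract_path(path: str) -> list[str]:
    """Return a list of problems with a Windows extraction path (empty means OK)."""
    problems = []
    p = PureWindowsPath(path)
    # Skip the drive anchor (e.g. "D:\") when one is present.
    folders = p.parts[1:] if p.anchor else p.parts

    if any(folder.lower() in RESTRICTED_PARTS for folder in folders):
        problems.append("path is inside a system folder or the Desktop")

    for folder in folders:
        # Strict reading of rule 2: ASCII letters and digits only.
        if not re.fullmatch(r"[A-Za-z0-9]+", folder):
            problems.append(
                f"folder name {folder!r} contains spaces, symbols, "
                "or non-English characters"
            )
    return problems


if __name__ == "__main__":
    for candidate in ("D:/videotrans", "C:/Program Files/videotrans"):
        print(candidate, check_extract_path(candidate))
```

An empty list means the path satisfies all three rules as interpreted here.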


Launching the Software

After unzipping, open the folder, find the sp.exe file, and double-click it to run.

The software needs to load many modules on its first launch, which may take tens of seconds. Please be patient.

1.2 For macOS / Linux Users (Source Code Deployment)

macOS and Linux users need to deploy the software from source code.


Software Interface and Core Functions

After launching the software, you will see the main interface, which is organized as follows.

  • Top Menu Bar: Configure global settings.

    • Translation Settings: Configure API Keys and related parameters for various translation services (e.g., OpenAI, Azure).

    • TTS Settings: Configure API Keys and related parameters for various dubbing services (e.g., OpenAI TTS, Azure TTS).

    • Speech Recognition Settings: Configure API Keys and parameters for speech recognition services (e.g., OpenAI API, Alibaba ASR).

    • Tools/Options: Contains various advanced options and auxiliary tools, such as subtitle format adjustment, video merging, vocal separation, etc.

    • Help/About: View software version information, documentation, and community links.

  • Top Function Area: Switch between the main functional modules of the software, such as Custom Video Translation, Audio/Video to Subtitles, etc.

  • Right Workspace: The specific operation area for the currently selected functional module.


Quick Start - The Full Video Translation Process

This is the core function of the software. We will guide you step-by-step through a complete video translation task. The Custom Video Translation module is open by default.

Step 1: Select Video and Output Settings

  • Select videos to process: Click the button to select one or more video files (hold Ctrl for multiple selections).
  • Folder: Check this option to process all videos within an entire folder in batch.
  • Save to..: Set the output directory for the translated videos. By default, it's the _video_out folder in the original video's directory.
  • Clean generated: Check this if you need to reprocess the same video from scratch (instead of using the cache).
  • Save video only: If checked, only the final MP4 video will be kept after processing, and intermediate files like subtitles and audio will be automatically deleted.
  • Move subtitle position: If the original video has hardcoded subtitles, check this to try placing the new subtitles in a different location to avoid overlap.
  • Shutdown when complete: Automatically shut down the computer after all tasks are finished, suitable for large-batch, long-running tasks.

Step 2: Configure Translation and Dubbing

  • Translation Service: Choose the engine to use for translating subtitles.
    • Free: Google(Free) (requires proxy), Microsoft Translator (no proxy needed).
    • High-Quality (requires API Key): OpenAI, Gemini, DeepL, etc. API Keys are set in the corresponding section of the top menu bar.
  • Source Language: You must accurately select the language spoken by the people in the original video.
  • Target Language: The language you want to translate into.
  • Glossary: If checked, you can use a preset glossary for translation to ensure the accuracy of professional terms.
  • Network Proxy: If you are using a service that requires a proxy (like Google, OpenAI), enter your proxy address and port here (e.g., http://127.0.0.1:10808).
  • Dubbing Service: Choose the engine for generating the voiceover. Edge-TTS is the default option; it is free and produces good results.
  • Voice Character: Select the target language first; the list of available voices (male, female, etc.) is loaded based on it.
  • Audition Voice: Click to preview the currently selected voice.
  • Dubbing Speed/Volume/Pitch: Adjust as needed. The values represent a percentage increase or decrease from the default.
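To make the percentage convention for speed, volume, and pitch concrete, here is a small sketch of the arithmetic. The helper names are hypothetical and not taken from pyVideoTrans. The signed-string format ("+20%") is what some TTS engines accept, but whether the software passes values in that form is an assumption.

```python
def percent_to_factor(percent: float) -> float:
    """Convert a signed percentage offset (e.g. +20 or -10) into a multiplier.

    +20 means 20% above the default (1.2x); -10 means 10% below (0.9x).
    """
    if percent <= -100:
        raise ValueError("a decrease of 100% or more is not meaningful")
    return 1.0 + percent / 100.0


def signed_percent(percent: int) -> str:
    """Format an offset as a signed string such as '+20%' or '-10%'."""
    return f"{percent:+d}%"


if __name__ == "__main__":
    for value in (20, 0, -10):
        print(signed_percent(value), "->", percent_to_factor(value))
```

A value of 0 leaves the default unchanged, which is why 0 maps to a factor of exactly 1.0.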

Step 3: Configure Speech Recognition

This is the crucial step of converting video speech into text subtitles, which directly affects the quality of all subsequent processes.

  • Speech Recognition: It's recommended to use the default faster-whisper(local), which is free, runs locally, and performs excellently.
  • Select Model: The larger the model, the more accurate the recognition, but the slower the speed and the more resources it consumes.
    • Beginner: tiny / medium
    • Recommended: large-v3-turbo (great performance and speed, highly recommended to use with an NVIDIA graphics card and CUDA acceleration).
  • Voice Splitting Mode: It's recommended to use the default Overall Recognition.
  • Re-segment with LLM: If checked, a Large Language Model will be used to intelligently segment the recognized text and optimize punctuation, significantly improving subtitle readability.
  • Denoise: If checked, the audio will be denoised to improve speech recognition accuracy in noisy environments.

Step 4: Set Synchronization and Subtitles

Since different languages have different speaking rates, the duration of the translated dubbing may not match the original video. You can make adjustments here.

  • Synchronization Alignment:
    • Speed up dubbing: When the dubbing is longer than the video, speed up the dubbing to match the video duration (commonly used).
    • Slow down video: When the dubbing is longer than the video, slow down the video to match the dubbing duration.
    • Extend video: When the dubbing is longer than the video, add a still frame at the end of the video to match the dubbing duration.
  • Subtitle Embedding:
    • Do not embed subtitles: Only replace the audio, without adding any subtitles.
    • Embed hard subtitles: Permanently "burn" the subtitles into the video frames, making them impossible to turn off.
    • Embed soft subtitles: Package the subtitles as a separate track within the video, allowing the player to toggle them on or off.
    • (Bilingual): Embed both source and target language subtitles simultaneously.
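The three synchronization options come down to simple duration arithmetic. The sketch below illustrates that arithmetic only. The function names are hypothetical, and how pyVideoTrans actually computes or caps these values is not described in this guide.

```python
def overrun_ratio(dub_seconds: float, slot_seconds: float) -> float:
    """Ratio of dubbing length to the original time slot, never below 1.0.

    For "Speed up dubbing", play the dubbing this many times faster.
    For "Slow down video", stretch the video segment by this factor.
    """
    if dub_seconds <= 0 or slot_seconds <= 0:
        raise ValueError("durations must be positive")
    return max(1.0, dub_seconds / slot_seconds)


def freeze_frame_seconds(total_dub_seconds: float, video_seconds: float) -> float:
    """Seconds of still frame to append for "Extend video" (0 if none needed)."""
    return max(0.0, total_dub_seconds - video_seconds)


if __name__ == "__main__":
    # A 6-second dubbed line that must fit a 4-second original slot.
    print(overrun_ratio(6.0, 4.0))
    # Dubbing runs 65 seconds in total against a 60-second video.
    print(freeze_frame_seconds(65.0, 60.0))
```

When the dubbing is shorter than its slot, the ratio stays at 1.0 and no adjustment is needed.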

Step 5: Process Background Audio

  • Keep original background audio: If checked, the software will attempt to separate the vocals and background sound from the original video and keep the background audio in the final video. Note: This feature significantly increases processing time but can greatly improve the quality of the final product.
  • Add extra background audio: You can also choose your own audio file to serve as new background music.
  • Background Volume: Adjust the volume of the background audio. Less than 1 decreases it, and greater than 1 increases it.
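The effect of the Background Volume factor can be illustrated with a toy mixer over 16-bit PCM samples. This is a sketch of the general technique (scale, add, clip), not the software's actual audio pipeline, and `mix_with_background` is a hypothetical name.

```python
def mix_with_background(voice, background, bg_volume=1.0):
    """Mix two sequences of 16-bit PCM samples.

    The background is multiplied by bg_volume before mixing: values below 1
    make it quieter, values above 1 make it louder. The result is clipped
    to the 16-bit range and has the same length as the voice track.
    """
    if bg_volume < 0:
        raise ValueError("bg_volume must not be negative")
    mixed = []
    for i, voice_sample in enumerate(voice):
        # Treat a background that ends early as silence.
        bg_sample = background[i] if i < len(background) else 0
        value = int(round(voice_sample + bg_sample * bg_volume))
        mixed.append(max(-32768, min(32767, value)))
    return mixed


if __name__ == "__main__":
    print(mix_with_background([1000, -1000], [500, 500], bg_volume=0.5))
```

Clipping matters when the factor is greater than 1, since a loud background can otherwise push the sum outside the valid sample range.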

Step 6: Start Processing

  • CUDA Acceleration: If you have an NVIDIA graphics card and the CUDA environment is correctly installed, be sure to check this option. It can make speech recognition several times, or even tens of times, faster.
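If you are unsure whether an NVIDIA GPU is visible to the system, a rough check is to look for the `nvidia-smi` tool that ships with the NVIDIA driver. The sketch below is only a heuristic: it does not verify that the CUDA libraries the software needs are installed, and `nvidia_gpu_visible` is a hypothetical helper, not part of pyVideoTrans.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` runs and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver tools not on PATH
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("NVIDIA GPU visible:", nvidia_gpu_visible())
```

A result of False means the option is unlikely to help. A result of True still leaves the CUDA environment itself to be confirmed.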

Once everything is set, click the 【Start】 button.

Processing

The software will begin its work. If you are processing only one video, it will pause after subtitle generation and translation so that you can proofread and edit the subtitles in the text box on the right. Once you have confirmed the subtitles, click to continue processing.

Step 7: Check the Results

After the task is complete, click on the progress bar area at the bottom to open the output folder. You will see the final MP4 file as well as the intermediate materials generated during the process, such as SRT subtitles and dubbing files.


Explore Other Useful Features

In addition to its core video translation function, pyVideoTrans offers several other powerful standalone features.

4.1 Audio/Video to Subtitles / Speech Transcription / Speech Recognition

Batch transcribe video or audio files into SRT subtitles. Simply drag and drop the files, set the source language and recognition model, and start. Supports advanced features like Re-segment with LLM and Denoise.
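For readers unfamiliar with the SRT format that this feature produces, the following sketch shows how recognized segments map onto SRT text. It assumes segments arrive as `(start_seconds, end_seconds, text)` tuples, which is a simplification chosen for illustration. The function names are hypothetical and this is not the software's own writer.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 1.5, "Hello"), (1.5, 3.0, "Welcome to the demo")]
    print(segments_to_srt(demo))
```

Each cue consists of a running index, a timecode line, and the text, with a blank line between cues.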

4.2 Batch Translate SRT Subtitles

If you already have SRT subtitle files, this feature can help you quickly translate them into other languages while preserving the timeline. It also supports various output formats like Monolingual subtitles, Bilingual (target language on top), and Bilingual (target language on bottom).
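The idea of translating text while leaving timecodes untouched, and the three output layouts, can be sketched as follows. The `translate` argument stands in for whichever translation service is selected; the layout names and function names are assumptions made for this example and do not reflect the software's internals.

```python
import re


def parse_srt(text: str):
    """Split SRT text into (index, timecode, text_lines) tuples."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        if len(lines) >= 3:  # index, timecode, at least one text line
            cues.append((lines[0].strip(), lines[1].strip(), lines[2:]))
    return cues


def translate_srt(text: str, translate, layout: str = "target_only") -> str:
    """Translate cue text with `translate`, keeping indices and timecodes.

    layout: "target_only", "target_top", or "target_bottom".
    """
    blocks = []
    for index, timecode, lines in parse_srt(text):
        source = " ".join(line.strip() for line in lines)
        target = translate(source)
        if layout == "target_only":
            body = target
        elif layout == "target_top":
            body = f"{target}\n{source}"
        elif layout == "target_bottom":
            body = f"{source}\n{target}"
        else:
            raise ValueError(f"unknown layout: {layout}")
        blocks.append(f"{index}\n{timecode}\n{body}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = "1\n00:00:00,000 --> 00:00:01,500\nhello\n"
    # str.upper is a stand-in for a real translation call.
    print(translate_srt(sample, str.upper, layout="target_top"))
```

Because only the text lines are passed to the translator, the timeline of the output always matches the input file.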

4.3 Batch Dubbing for Subtitles

Convert your SRT files or plain text into dubbing files (like WAV or MP3) in batch using your chosen TTS engine. Supports fine-tuning of speech rate, volume, and pitch.

4.4 Merge Audio, Video, and Subtitles

This is a practical post-production tool. When you have separate video, dubbing, and subtitle files, you can use it to perfectly merge them into a single final video file, with support for custom subtitle styles.
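A merge of this kind can also be done by hand with FFmpeg. The sketch below only builds the argument list for such a command. It assumes FFmpeg is installed and an MP4 output, and it is not necessarily the command pyVideoTrans runs internally. The function name `build_merge_command` is hypothetical.

```python
def build_merge_command(video, audio, subtitles, output, hard_subs=False):
    """Build an FFmpeg argument list that merges video, dubbing, and subtitles.

    hard_subs=False adds the subtitles as a switchable track (mov_text, for MP4).
    hard_subs=True burns them into the frames, which re-encodes the video.
    """
    cmd = ["ffmpeg", "-y", "-i", video, "-i", audio]
    if hard_subs:
        # Note: paths with special characters (such as a Windows drive colon)
        # need escaping inside the filter string; that is not handled here.
        cmd += ["-vf", f"subtitles={subtitles}",
                "-map", "0:v:0", "-map", "1:a:0",
                "-c:a", "aac"]
    else:
        cmd += ["-i", subtitles,
                "-map", "0:v:0", "-map", "1:a:0", "-map", "2:s:0",
                "-c:v", "copy", "-c:a", "aac", "-c:s", "mov_text"]
    cmd.append(output)
    return cmd


if __name__ == "__main__":
    print(" ".join(build_merge_command("in.mp4", "dub.wav", "subs.srt", "out.mp4")))
```

Returning a list rather than a single string makes it straightforward to pass the result to `subprocess.run` without shell quoting issues.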


Feature Overview and Supported Services

The power of pyVideoTrans lies in its high extensibility and support for a wide range of services.

  • Speech Recognition (STT) Support:

    • Local Offline: faster-whisper, openai-whisper
    • Online API: OpenAI SpeechToText, GoogleSpeech, Alibaba FunASR, Doubao Model, and custom APIs.
  • Subtitle Translation Support:

    • Microsoft Translator, Google Translate, Baidu Translate, Tencent Translate, DeepL, DeepLX, ByteDance Volcano
    • Large Language Models: ChatGPT, AzureAI, Gemini, other OpenAI-compatible AI models, and local LLMs
    • Offline Translation: OTT
  • Text-to-Speech (TTS) Support:

    • Microsoft Edge TTS, Google TTS, Azure AI TTS, OpenAI TTS, ElevenLabs
    • Voice Cloning/Local: GPT-SoVITS, clone-voice, ChatTTS, Fish TTS, CosyVoice, F5-TTS, KokoroTTS
    • Custom TTS Server API
  • Supported Languages:

    • Simplified/Traditional Chinese, English, Korean, Japanese, Russian, French, German, Italian, Spanish, Portuguese, Vietnamese, Thai, Arabic, Turkish, Hungarian, Hindi, Ukrainian, Kazakh, Indonesian, Malay, Czech, Polish, Dutch, Swedish, Filipino, Finnish, Persian, and more, with support for auto-detection.

Thank you for choosing pyVideoTrans. We hope this software becomes your powerful assistant in bridging language gaps.