Advanced Settings Options Explained

From the top menu, choose Tools/Options and then Advanced Options to customize parameters for finer control. See the figure below.

[Figure: image-20240804220459698, screenshot of the Advanced Options dialog]

Interface Language: Sets the software interface language. By default it follows the operating system. zh is Chinese and en is English. Restart the software after changing this setting.

Pause Countdown: When translating a single video, the software pauses after recognizing the subtitles and again after translating them. Set the length of that pause here, in seconds.

Background Volume Multiplier: The background audio volume as a multiple of the original. For example, 0.8 reduces the volume to 80% of the original.

Loop Background Audio: Whether to repeat the background audio when it is shorter than the video. true=loop, false=do not loop.
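
The two background audio settings above can be illustrated with a little arithmetic. This is a minimal sketch, not the software's own code: the function names are hypothetical, and it assumes the multiplier is applied as a linear scale on sample amplitude.

```python
import math


def background_repeats(bg_seconds: float, video_seconds: float, loop: bool) -> int:
    """Return how many times the background track is played back to back."""
    if bg_seconds <= 0:
        raise ValueError("bg_seconds must be positive")
    if loop and bg_seconds < video_seconds:
        # Enough repeats to cover the whole video; the last repeat is cut short.
        return math.ceil(video_seconds / bg_seconds)
    return 1


def scale_volume(samples: list[float], multiplier: float) -> list[float]:
    """Scale amplitude samples by the volume multiplier (assumed linear)."""
    return [s * multiplier for s in samples]


print(background_repeats(30, 100, loop=True))   # 4
print(background_repeats(30, 100, loop=False))  # 1
print(scale_volume([1.0, -0.5], 0.8))           # [0.8, -0.4]
```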

302.ai Translation Model List: The model names 302.ai uses for translation, separated by half-width (ASCII) commas.

302.ai TTS Model List: The model names 302.ai uses for voiceover, separated by half-width (ASCII) commas.

ChatGPT Model List: The selectable ChatGPT models, separated by half-width (ASCII) commas.

Gemini Model List: The selectable Gemini models, separated by half-width (ASCII) commas.

Azure Model List: The selectable Azure models, separated by half-width (ASCII) commas.

Local LLM Model List: The selectable local LLM models, separated by half-width (ASCII) commas.
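
To show why the comma type matters in the model lists above, here is a small sketch of how such a value might be parsed. The function name and the model names are illustrative assumptions, not taken from the software.

```python
def parse_model_list(raw: str) -> list[str]:
    """Split a comma-separated model list, dropping blanks and stray spaces."""
    return [name.strip() for name in raw.split(",") if name.strip()]


# Model names below are placeholders for illustration only.
print(parse_model_list("model-a, model-b,,model-c "))
# ['model-a', 'model-b', 'model-c']

# A full-width comma is not a separator, so the two names stay fused.
print(parse_model_list("model-a，model-b"))
# ['model-a，model-b']
```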

ByteDance Volcano Inference Endpoint: Fill in the inference endpoint name created in ByteDance Volcano Ark. See https://pyvideotrans.com/zijiehuoshan for creation methods.

Video Transcoding Loss Control: Controls the quality loss during video transcoding. 0=lowest loss, 51=highest loss. Default is 13.

NVIDIA Use qp Instead of crf: Whether to use qp instead of crf to control video quality loss on NVIDIA graphics cards. true=yes, false=no.

Output Video Quality Control: Controls output video quality and size. Faster settings give lower quality.

Custom ffmpeg Command Parameters: Custom ffmpeg command parameters are inserted in the second-to-last position of the command, for example -bf 7 -b_ref_mode middle

264 or 265 Video Encoding: Fill in 264 to use libx264 encoding, or 265 to use libx265 encoding. 264 has better compatibility; 265 has a higher compression ratio and higher definition.
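
The video encoding settings above can be pictured as pieces of one ffmpeg argument list. The sketch below is an assumption about how they might combine, not the software's actual command: the function name and defaults are hypothetical, the quality setting is assumed to map to ffmpeg's -preset option, and the custom parameters are placed just before the output file (the last argument). Whether a given custom flag is accepted depends on the encoder in use.

```python
import shlex


def build_ffmpeg_args(src, dst, codec="264", loss=13, use_qp=False,
                      preset="fast", extra=""):
    """Assemble an ffmpeg argument list from the advanced settings."""
    encoders = {"264": "libx264", "265": "libx265"}
    if codec not in encoders:
        raise ValueError("codec must be '264' or '265'")
    if not 0 <= loss <= 51:
        raise ValueError("loss must be between 0 and 51")
    # qp replaces crf only when the option is switched on.
    quality_flag = "-qp" if use_qp else "-crf"
    args = ["ffmpeg", "-y", "-i", src,
            "-c:v", encoders[codec],
            quality_flag, str(loss),
            "-preset", preset]
    # Custom parameters go in the second-to-last position, before the output.
    args += shlex.split(extra)
    args.append(dst)
    return args


print(build_ffmpeg_args("in.mp4", "out.mp4", extra="-bf 7 -b_ref_mode middle"))
```

Building a list rather than a single string avoids shell quoting problems when the list is later passed to subprocess.run.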

Audio Maximum Acceleration Multiple: The maximum factor by which the dubbed audio may be sped up, used to align the dubbing duration with the original duration. Set a number between 1 and 100. The default is 3, meaning the audio is accelerated to at most 3 times the original speech speed.

Video Slow Motion Multiple: The maximum allowed video slow motion factor, used to extend the video so that it aligns with the dubbing and subtitles. Set a number greater than 1. 0 or 1 means no video slow motion.
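
To make the interaction of the two limits above concrete, the sketch below works through the arithmetic for one subtitle. It is an assumption about how the limits could be applied (audio is sped up first, then the video is slowed), not a description of the software's actual algorithm, and every name in it is hypothetical.

```python
def plan_alignment(slot_ms, dub_ms, max_audio_speed=3.0, max_video_slow=1.0):
    """Return (audio_speed, video_slow, leftover_ms) for one subtitle.

    slot_ms is the original subtitle duration, dub_ms the dubbed audio length.
    leftover_ms is the overrun that neither limit could absorb.
    """
    if dub_ms <= slot_ms:
        return 1.0, 1.0, 0.0
    # Step 1: speed the audio up, but never past the configured maximum.
    audio_speed = min(dub_ms / slot_ms, max_audio_speed)
    sped_ms = dub_ms / audio_speed
    if sped_ms - slot_ms < 1e-6:
        return audio_speed, 1.0, 0.0
    # Step 2: stretch the video, if slow motion is enabled (value above 1).
    video_slow = 1.0
    if max_video_slow > 1:
        video_slow = min(sped_ms / slot_ms, max_video_slow)
    leftover_ms = max(0.0, sped_ms - slot_ms * video_slow)
    return audio_speed, video_slow, leftover_ms


print(plan_alignment(1000, 1500))                # (1.5, 1.0, 0.0)
print(plan_alignment(1000, 4000, 2.0, 1.5))      # (2.0, 1.5, 500.0)
```

In the second call, doubling the audio speed shortens 4000ms to 2000ms, slowing the video by 1.5 stretches the 1000ms slot to 1500ms, and 500ms of overrun remains.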

Remove Dubbing Trailing Blank: Whether to remove the silence at the end of the dubbing. true=remove, false=do not remove.

Remove Subtitle Duration Greater Than Dubbing Duration: Whether to remove the leftover silence when the original subtitle lasts longer than its dubbing. For example, if the original subtitle lasts 5s and the dubbing lasts 3s, this setting decides whether the 2s of silence is removed. true=remove, false=do not remove.

Remove Silent Length Between 2 Subtitles: The amount of silence to remove between two subtitles, in ms. For example, with 100, if the interval between two subtitles is greater than 100ms, 100ms of it is removed. -1=remove the silence completely.

Force Modify Subtitle Timeline: true=force the subtitle timeline to match the sound, false=keep the original subtitle timeline. Keeping the original timeline may leave the subtitles and sound out of sync.
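
The rule for Remove Silent Length Between 2 Subtitles can be written as a short function. This is a sketch of the rule as described here, with a hypothetical function name; the software's handling of edge cases may differ.

```python
def trimmed_gap(gap_ms: int, remove_ms: int) -> int:
    """Return the silence left between two subtitles after trimming."""
    if remove_ms == -1:
        # -1 removes the silence completely.
        return 0
    if gap_ms > remove_ms:
        # Only gaps longer than the setting are shortened by that amount.
        return gap_ms - remove_ms
    return gap_ms


print(trimmed_gap(350, 100))  # 250
print(trimmed_gap(80, 100))   # 80
print(trimmed_gap(350, -1))   # 0
```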

Enable VAD: Whether to enable VAD (voice activity detection) in the overall recognition mode of faster-whisper. true=enable, false=disable. Enabled by default.

Minimum Silent Segment: The minimum length of a silent segment, in ms. Default is 250ms.

Maximum Statement Duration Seconds: The maximum duration of a single statement, in seconds. Default is 6s.

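VAD Threshold: The speech probability threshold used by VAD. In faster-whisper's VAD, audio scoring above this value is treated as speech.

VAD Pad Value: The padding applied by VAD. In faster-whisper's VAD, detected speech segments are padded by this amount on each side.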
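
The VAD settings above resemble the vad_filter and vad_parameters arguments of faster-whisper's WhisperModel.transcribe. The sketch below assumes that mapping; the correspondence between each setting and each key is a guess, the function name is hypothetical, and the threshold and pad values are placeholders rather than the software's defaults.

```python
def build_vad_kwargs(enable_vad=True, min_silence_ms=250, max_speech_s=6.0,
                     threshold=0.5, speech_pad_ms=400):
    """Build keyword arguments for WhisperModel.transcribe (assumed mapping)."""
    if not enable_vad:
        return {"vad_filter": False}
    return {
        "vad_filter": True,
        "vad_parameters": {
            "min_silence_duration_ms": min_silence_ms,  # Minimum Silent Segment
            "max_speech_duration_s": max_speech_s,      # Maximum Statement Duration
            "threshold": threshold,                     # VAD Threshold (placeholder)
            "speech_pad_ms": speech_pad_ms,             # VAD Pad Value (placeholder)
        },
    }


kwargs = build_vad_kwargs()
print(kwargs)

# With faster-whisper installed, the result could be passed along like this:
# from faster_whisper import WhisperModel
# model = WhisperModel("small")
# segments, info = model.transcribe("audio.wav", **kwargs)
```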

Silent Segment During Equal Division: The silent segment used in equal division mode. Default is 10s.

Segment Duration During Equal Division: The duration of each segment, in seconds, in equal division mode.
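
As a rough picture of equal division, the sketch below cuts an audio track into segments of a fixed duration. It deliberately ignores the silent segment setting, whose exact role is not described here, and the function name is hypothetical.

```python
def split_equal(total_ms: int, segment_ms: int) -> list[tuple[int, int]]:
    """Cut [0, total_ms) into consecutive segments of segment_ms each."""
    if segment_ms <= 0:
        raise ValueError("segment_ms must be positive")
    # The final segment is shorter when the total is not an exact multiple.
    return [(start, min(start + segment_ms, total_ms))
            for start in range(0, total_ms, segment_ms)]


print(split_equal(25_000, 10_000))
# [(0, 10000), (10000, 20000), (20000, 25000)]
```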

faster and openai Model List: The model names available in faster mode and openai mode, separated by half-width (ASCII) commas.

CUDA Data Type: The CUDA data type in faster mode. int8=lower resource consumption, faster, lower precision. float32=higher resource consumption, slower, higher precision. int8_float16=chosen by the device.

whisper Model Prompt: The prompt sent to the whisper model.

faster-whisper cpu Process: The number of CPU processes used for subtitle recognition in faster mode.

faster-whisper Worker Process: The number of concurrent worker processes used for subtitle recognition in faster mode.

Subtitle Recognition Accuracy Control 1: Accuracy adjustment for subtitle recognition, from 1 to 5. 1=lowest video memory consumption, 5=highest video memory consumption.

Subtitle Recognition Accuracy Control 2: Accuracy adjustment for subtitle recognition, from 1 to 5. 1=lowest video memory consumption, 5=highest video memory consumption.

faster-whisper Temperature Control: 0=uses fewer GPU resources with slightly worse results, 1=uses more GPU resources with better results.

Context Awareness: true=uses more GPU resources with better results, false=uses fewer GPU resources with slightly worse results.

Hard Subtitle Font Pixel: The font size of hard subtitles, in pixels.

Hard Subtitle Font Name: The font name used for hard subtitles.

Hard Subtitle Text Color: Sets the font color. The 6 characters after &H are hexadecimal digits in BGR order: 2 digits of blue, 2 digits of green, 2 digits of red. This is the reverse of the common RGB order.

Hard Subtitle Text Border Color: Sets the font border color. The 6 characters after &H are hexadecimal digits in BGR order: 2 digits of blue, 2 digits of green, 2 digits of red. This is the reverse of the common RGB order.
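
Because the BGR order is easy to get wrong, a small converter from the familiar RGB hex notation may help. This is an illustrative sketch with a hypothetical function name; it produces only the &H prefix plus 6 digits described above.

```python
def rgb_to_bgr_color(rgb_hex: str) -> str:
    """Convert 'RRGGBB' or '#RRGGBB' to the '&HBBGGRR' form."""
    value = rgb_hex.lstrip("#").upper()
    if len(value) != 6:
        raise ValueError("expected 6 hexadecimal digits")
    int(value, 16)  # raises ValueError if the digits are not hexadecimal
    red, green, blue = value[0:2], value[2:4], value[4:6]
    return f"&H{blue}{green}{red}"


print(rgb_to_bgr_color("#FF0000"))  # &H0000FF (pure red)
print(rgb_to_bgr_color("#FF8000"))  # &H0080FF (orange)
print(rgb_to_bgr_color("ffffff"))   # &HFFFFFF (white)
```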

Hard Subtitle Move Up Distance: The subtitle is located at the bottom of the video by default. Here you can set a value greater than 0, which represents how much the subtitle moves up. Note that the maximum should not be greater than (video height - 20), that is, at least 20 of the height should be reserved for displaying subtitles, otherwise the subtitles will be invisible.

Re-segment After Faster/OpenAI-Whisper Recognition: If selected, nltk will be used to re-segment after recognition.

Number of Characters Per Line for Chinese, Japanese, and Korean: The number of characters per line for Chinese, Japanese, and Korean hard subtitles. Text beyond this length wraps to the next line. The default is 20 characters, which is also used as the basis for re-segmentation.

Number of Characters Per Line for Other Languages: The number of characters per line for hard subtitles in other languages. Text beyond this length wraps to the next line. The default is 54 characters, which is also used as the basis for re-segmentation.
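
The two line-length limits above can be sketched as follows. This is only an approximation of wrapping behavior, with a hypothetical function name: Chinese, Japanese, and Korean text is cut every fixed number of characters, while other languages are wrapped at word boundaries with Python's textwrap module. The software's own wrapping logic may differ.

```python
import textwrap


def wrap_subtitle(text: str, cjk: bool, cjk_width: int = 20,
                  other_width: int = 54) -> list[str]:
    """Break one subtitle into display lines."""
    if cjk:
        # These scripts have no spaces, so cut at a fixed character count.
        return [text[i:i + cjk_width] for i in range(0, len(text), cjk_width)]
    # Other languages wrap at spaces, never exceeding other_width per line.
    return textwrap.wrap(text, width=other_width)


for line in wrap_subtitle("字" * 45, cjk=True):
    print(len(line))  # 20, 20, 5

sentence = "The quick brown fox jumps over the lazy dog and keeps running far away."
for line in wrap_subtitle(sentence, cjk=False):
    print(line)
```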

Subtitle Traditional to Simplified: Force recognized Traditional Chinese subtitles to be converted to Simplified Chinese.

Number of Subtitles Translated Simultaneously: The number of subtitle entries translated in one batch. Default is 15.

Number of Translation Retries: The number of retries when a translation error occurs. Default is 2.

Pause Time After Translation: The pause after each translation, in seconds, used to limit the request frequency.

Number of Dubbed Subtitles Simultaneously: The number of subtitle entries dubbed simultaneously.

AzureTTS Batch Lines: The number of lines AzureTTS dubs in one request. Default is 150.

ChatTTS Voice Tone Value: The voice tone value for ChatTTS.
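
The three translation settings above (batch size, retries, pause) fit together as a simple loop. The sketch below is an assumption about that loop rather than the software's implementation: translate_batch is a hypothetical callable standing in for a real translation request, and a retry count of 2 is taken to mean two further attempts after the first failure.

```python
import time


def translate_all(lines, translate_batch, batch_size=15, retries=2, pause_s=0.0):
    """Translate subtitle lines in batches, retrying failed batches."""
    results = []
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        for attempt in range(retries + 1):
            try:
                results.extend(translate_batch(batch))
                break
            except Exception:
                # Give up only after the last allowed retry has failed.
                if attempt == retries:
                    raise
        # Pause after each batch to limit the request frequency.
        time.sleep(pause_s)
    return results


def fake_translate(batch):
    """Stand-in translator that just upper-cases each line."""
    return [line.upper() for line in batch]


subtitles = [f"line {i}" for i in range(32)]
translated = translate_all(subtitles, fake_translate)
print(len(translated), translated[0])  # 32 LINE 0
```

With 32 subtitle entries and a batch size of 15, this makes three requests (15, 15, and 2 entries).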