Advanced Settings Options Explained
From the top menu, choose Tools/Options -- Advanced Options to customize parameters for finer control, as shown in the figure below.
Click on the text title on the left to display detailed instructions.
Interface Language: Set the software interface language. The software needs to be restarted after modification. The default is to follow the operating system. zh represents Chinese, and en represents English.
Pause Countdown: When processing a single video translation, there is a pause after the subtitles are recognized and after the subtitles are translated. You can set the pause duration in seconds here.
Background Volume Multiplier: The background audio volume value is a multiple of the original. For example, if you fill in 0.8, the volume will be reduced to 80% of the original.
Loop Play Background Sound: Whether to repeat the background sound if the background audio is shorter than the video. true means loop play, and false means no loop.
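As a rough illustration of how the two background audio settings interact, the sketch below computes how many times a background track would play and how much of the video it covers. The function name `background_plan` and its return format are hypothetical and are not taken from the software.

```python
import math


def background_plan(bg_seconds: float, video_seconds: float,
                    volume: float = 0.8, loop: bool = True) -> dict:
    """Describe how a background track would be fitted to a video (illustrative only)."""
    if bg_seconds <= 0 or video_seconds <= 0:
        raise ValueError("durations must be positive")
    # Looping only matters when the background track is shorter than the video.
    if loop and bg_seconds < video_seconds:
        plays = math.ceil(video_seconds / bg_seconds)
    else:
        plays = 1
    # The background never extends past the end of the video.
    covered = min(plays * bg_seconds, video_seconds)
    return {
        "plays": plays,
        "covered_seconds": covered,
        "volume_percent": round(volume * 100),
    }


print(background_plan(30, 100, volume=0.8, loop=True))
print(background_plan(30, 100, volume=0.8, loop=False))
```

With looping disabled, a 30-second track covers only the first 30 seconds of a 100-second video; with looping enabled it is repeated until the video ends.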
302.ai Translation Model List: Fill in the model names used by 302.ai for translation, separated by half-width (English) commas.
302.ai TTS Model List: Fill in the model names used by 302.ai for dubbing, separated by half-width (English) commas.
ChatGPT Model List: Available ChatGPT models, separated by half-width (English) commas.
Gemini Model List: Available Gemini models, separated by half-width (English) commas.
Azure Model List: Available models, separated by half-width (English) commas.
Local LLM Model List: Available models, separated by half-width (English) commas.
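The model list fields all share the same comma-separated format. A minimal sketch of how such a value could be parsed is shown here; `parse_model_list` is a hypothetical helper, and the model names are only sample input. It also shows why the comma type matters: a full-width comma is not treated as a separator.

```python
def parse_model_list(raw: str) -> list[str]:
    """Split a comma-separated model list, dropping blanks and duplicates."""
    models: list[str] = []
    for name in raw.split(","):  # splits on the half-width comma only
        name = name.strip()
        if name and name not in models:
            models.append(name)
    return models


print(parse_model_list("gpt-4o, gpt-4o-mini,,gpt-4o "))
# A full-width comma is kept as part of the name, so the entry is not split.
print(parse_model_list("model-a，model-b"))
```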
Byte Volcano Inference Endpoint: Fill in the name of the inference endpoint created in Byte Volcano Ark. See https://pyvideotrans.com/zijiehuoshan for the creation method.
Video Transcoding Loss Control: Loss control during video transcoding. 0 = lowest loss, 51 = highest loss, default 13.
NVIDIA Use qp Instead of crf: Whether to use qp instead of crf to control video quality loss on NVIDIA graphics cards. true = yes, false = no.
Output Video Quality Control: Controls the trade-off between encoding speed and the quality and size of the output video. The faster the setting, the worse the quality.
Custom ffmpeg Command Parameters: Custom ffmpeg parameters, inserted in the second-to-last position of the command, for example -bf 7 -b_ref_mode middle.
264 or 265 Video Encoding: Fill in 264 to indicate the use of libx264 encoding, and fill in 265 to indicate the use of libx265 encoding. 264 has better compatibility, and 265 has a larger compression ratio and higher definition.
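The sketch below shows one way the transcoding settings could map onto an ffmpeg command line. It rests on several assumptions that are not confirmed here: that the loss control value is passed as `-crf`, that the quality control value is passed as `-preset` (with `fast` as an arbitrary example value), and that "second to last position" means immediately before the output file. `build_ffmpeg_cmd` is a hypothetical helper that only builds the argument list and does not run ffmpeg.

```python
import shlex

# 264 -> libx264, 265 -> libx265, as described in the encoding setting.
ENCODERS = {"264": "libx264", "265": "libx265"}


def build_ffmpeg_cmd(src: str, dst: str, codec: str = "264", crf: int = 13,
                     preset: str = "fast", custom: str = "") -> list[str]:
    """Build an ffmpeg argument list from the settings (assumed mapping)."""
    if codec not in ENCODERS:
        raise ValueError("codec must be '264' or '265'")
    if not 0 <= crf <= 51:
        raise ValueError("crf must be between 0 and 51")
    cmd = ["ffmpeg", "-y", "-i", src,
           "-c:v", ENCODERS[codec],
           "-crf", str(crf),
           "-preset", preset]
    # Custom parameters go just before the output path (assumed placement).
    cmd += shlex.split(custom)
    cmd.append(dst)
    return cmd


print(" ".join(build_ffmpeg_cmd("in.mp4", "out.mp4", custom="-bf 7")))
```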
Audio Maximum Acceleration Multiple: The maximum factor by which the dubbed audio may be sped up, used to shorten the dubbing so that it aligns with the original duration. Set a number from 1 to 100. The default is 3, meaning the audio is accelerated to at most 3 times the original speech speed.
Video Slow Speed Multiple: The maximum allowable factor by which the video may be slowed down, used to extend the video so that it aligns with the dubbing and subtitles. Set a number greater than 1 to enable it. 0 or 1 means the video is not slowed down.
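To make the two limits concrete, here is a simplified sketch of how a dubbed clip that is too long for its slot might be aligned. It assumes the audio is sped up first and the video is slowed only for whatever overshoot remains; the software may combine the two differently. `plan_alignment` and its parameter names are hypothetical.

```python
def plan_alignment(slot_ms: float, dub_ms: float,
                   max_audio_rate: float = 3.0,
                   max_video_slow: float = 1.0) -> tuple[float, float, float]:
    """Return (audio_rate, video_slow_factor, leftover_ms) for one subtitle slot."""
    if slot_ms <= 0 or dub_ms < 0:
        raise ValueError("slot_ms must be positive and dub_ms non-negative")
    if dub_ms <= slot_ms:
        return 1.0, 1.0, 0.0  # the dubbing already fits
    # Speed the audio up just enough to fit, capped at the configured maximum.
    audio_rate = min(dub_ms / slot_ms, max(max_audio_rate, 1.0))
    sped_up_ms = dub_ms / audio_rate
    # Slow the video only if the audio still overshoots and slowing is enabled.
    video_slow = 1.0
    if sped_up_ms > slot_ms and max_video_slow > 1:
        video_slow = min(sped_up_ms / slot_ms, max_video_slow)
    # Whatever still does not fit after both adjustments.
    leftover_ms = max(0.0, sped_up_ms - slot_ms * video_slow)
    return audio_rate, video_slow, leftover_ms


print(plan_alignment(2000, 3000))                       # fits with audio speed-up alone
print(plan_alignment(1000, 4000, max_video_slow=2.0))   # needs video slow-down as well
```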
Remove Dubbing Trailing Whitespace: Whether to remove the silent whitespace at the end of the dubbing, true = remove, false = do not remove.
Remove Subtitle Duration Greater Than Dubbing Duration: Whether to remove the leftover silence when the original subtitle lasts longer than its dubbing. For example, if the original duration is 5s and the dubbing is 3s, this controls whether the remaining 2s of silence is removed. true = remove, false = do not remove.
Remove Silent Length Between 2 Subtitles: The length of silence, in ms, to remove between two subtitles. For example, with 100, if the interval between two subtitles is greater than 100ms, 100ms of it is removed. -1 = remove the silence completely.
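The rule for the silence between two subtitles can be sketched as follows. This is an illustrative reading of the description above, not the software's actual code; `close_gaps` is a hypothetical helper that takes sorted, non-overlapping (start, end) times in milliseconds and shifts later subtitles earlier as silence is removed.

```python
def close_gaps(subs: list[tuple[int, int]], remove_ms: int) -> list[tuple[int, int]]:
    """Shift subtitles earlier by removing silence between neighbours."""
    if remove_ms < -1:
        raise ValueError("remove_ms must be -1 or a non-negative number")
    shifted = []
    shift = 0          # total silence removed so far
    prev_end = None    # end time of the previous subtitle on the original timeline
    for start, end in subs:
        if prev_end is not None:
            gap = start - prev_end
            if remove_ms == -1:
                cut = max(gap, 0)      # remove the whole gap
            elif gap > remove_ms:
                cut = remove_ms        # remove only the configured length
            else:
                cut = 0                # gap is too short to trim
            shift += cut
        shifted.append((start - shift, end - shift))
        prev_end = end
    return shifted


subs = [(0, 1000), (1500, 2500), (2550, 3000)]
print(close_gaps(subs, 100))
print(close_gaps(subs, -1))
```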
Force Modify Subtitle Timeline: true = force the subtitle timeline to be modified to match the sound; false = keep the original subtitle timeline. Not modifying it may cause the subtitles and sound to not match.
Enable VAD: Enable VAD in faster-whisper subtitle overall recognition mode. true = enable, false = disable. Enabled by default.
Minimum Silent Fragment: Minimum length of a silent fragment in ms, default 250ms.
Maximum Duration of Statement: Maximum duration of a statement in seconds, default 6s.
VAD Threshold: VAD threshold
VAD pad Value: VAD pad value
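For orientation, the sketch below shows how these VAD settings might be collected into the keyword arguments that faster-whisper's `transcribe` accepts (`vad_filter` and `vad_parameters`). The mapping of each setting to a VAD option name is an assumption, `build_vad_kwargs` is a hypothetical helper, and the threshold and pad values in the usage line are example values only, since no defaults are stated above.

```python
def build_vad_kwargs(enable_vad: bool, threshold: float, speech_pad_ms: int,
                     min_silence_ms: int = 250,
                     max_speech_s: float = 6.0) -> dict:
    """Collect the VAD settings into transcribe() keyword arguments (assumed mapping)."""
    if not enable_vad:
        return {"vad_filter": False}
    return {
        "vad_filter": True,
        "vad_parameters": {
            "threshold": threshold,                     # VAD Threshold
            "min_silence_duration_ms": min_silence_ms,  # Minimum Silent Fragment
            "max_speech_duration_s": max_speech_s,      # Maximum Duration of Statement
            "speech_pad_ms": speech_pad_ms,             # VAD pad Value
        },
    }


# 0.5 and 400 are example values, not defaults taken from the software.
print(build_vad_kwargs(True, threshold=0.5, speech_pad_ms=400))
```

The resulting dictionary could then be unpacked into a call such as `model.transcribe(audio_path, **kwargs)`.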
Silent Fragment During Equal Segmentation: Silent fragment during equal segmentation mode, default 10s
Fragment Duration During Equal Segmentation: Duration of each fragment in seconds during equal segmentation mode
Model List of Faster and OpenAI: The model name list of faster mode and openai mode, separated by half-width (English) commas.
CUDA Data Type: CUDA data type in faster mode. int8 = less resource consumption, faster, lower accuracy; float32 = more resource consumption, slower, higher accuracy; int8_float16 = device-selected.
Whisper Model Prompt Words: Prompt words sent to the whisper model
Faster-Whisper CPU Process: The number of CPU processes when recognizing subtitles in faster mode.
Faster-Whisper Worker Process: The number of simultaneous working processes when recognizing subtitles in faster mode.
Subtitle Recognition Accuracy Control 1: Accuracy adjustment during subtitle recognition, 1-5, 1 = lowest video memory consumption, 5 = most video memory consumption
Subtitle Recognition Accuracy Control 2: Accuracy adjustment during subtitle recognition, 1-5, 1 = lowest video memory consumption, 5 = most video memory consumption
Faster-Whisper Temperature Control: 0 = occupies less GPU resources but has slightly worse effect, 1 = occupies more GPU resources and has better effect
Context Awareness: true = occupies more GPU and has better effect, false = occupies less GPU and has slightly worse effect
Hard Subtitle Font Pixel: Hard subtitle font pixel size
Hard Subtitle Font Name: Hard subtitle font name
Hard Subtitle Text Color: Sets the font color. Note that the 6 characters after &H are read in pairs in BGR order, that is, 2 digits of blue, 2 digits of green, then 2 digits of red, which is the reverse of the common RGB order.
Hard Subtitle Text Border Color: Sets the border color of the font. Note that the 6 characters after &H are read in pairs in BGR order, that is, 2 digits of blue, 2 digits of green, then 2 digits of red, which is the reverse of the common RGB order.
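Because the byte order is easy to get wrong, a small conversion helper can be useful. The sketch below converts a common RGB hex color into the &H form with the pairs reversed, and back again. The function names are hypothetical, and the sketch handles only the 6-character form described above.

```python
def rgb_to_bgr_setting(rgb_hex: str) -> str:
    """Convert '#RRGGBB' (or 'RRGGBB') to '&HBBGGRR'."""
    digits = rgb_hex.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected 6 hex digits")
    int(digits, 16)  # raises ValueError if the digits are not valid hex
    red, green, blue = digits[0:2], digits[2:4], digits[4:6]
    return "&H" + (blue + green + red).upper()


def bgr_setting_to_rgb(value: str) -> str:
    """Convert '&HBBGGRR' back to '#RRGGBB'."""
    if not value.upper().startswith("&H") or len(value) != 8:
        raise ValueError("expected '&H' followed by 6 hex digits")
    digits = value[2:]
    int(digits, 16)
    blue, green, red = digits[0:2], digits[2:4], digits[4:6]
    return "#" + (red + green + blue).upper()


print(rgb_to_bgr_setting("#FF0000"))   # pure red
print(bgr_setting_to_rgb("&HFF901E"))
```

Pure red, `#FF0000` in RGB, becomes `&H0000FF`, since the red pair moves to the end.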
Hard Subtitle Move Up Distance: The subtitles are located at the bottom of the video by default. Here you can set a value greater than 0, which represents how much distance the subtitles move up. Note that the maximum cannot be greater than (video height - 20), that is, at least 20 of height must be reserved for displaying subtitles, otherwise the subtitles will be invisible.
Re-Punctuated After Faster/OpenAI-Whisper Recognition: If selected, nltk will be used to re-punctuate the text after recognition.
Number of Characters Per Line in Chinese, Japanese, and Korean: The number of characters in a line for Chinese, Japanese, and Korean hard subtitles. If it exceeds this number, it will wrap to the next line. The default is 20 characters, which is also used as the basis for re-punctuation.
Number of Characters Per Line in Other Languages: The line length for other languages when hard subtitles are used. If the number of characters exceeds this number, it will wrap to the next line. The default is 54 characters, which is also used as the basis for re-punctuation.
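The two line-length limits can be illustrated with a short wrapping sketch. It assumes that Chinese, Japanese, and Korean text is cut at a fixed number of characters, while other languages are wrapped at word boundaries; the software's real wrapping logic may differ. `wrap_subtitle` is a hypothetical helper.

```python
import textwrap


def wrap_subtitle(text: str, cjk: bool = False,
                  cjk_len: int = 20, other_len: int = 54) -> list[str]:
    """Wrap one subtitle into lines no longer than the configured limit."""
    if cjk:
        # CJK text has no spaces between words, so cut at a fixed character count.
        return [text[i:i + cjk_len] for i in range(0, len(text), cjk_len)]
    # Other languages: break at whitespace so words stay intact.
    return textwrap.wrap(text, width=other_len)


english = "The quick brown fox jumps over the lazy dog and keeps running through the field"
for line in wrap_subtitle(english):
    print(line)
for line in wrap_subtitle("字" * 45, cjk=True):
    print(len(line))
```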
Subtitle Traditional to Simplified: Force the recognized traditional subtitles to be converted to simplified Chinese
Number of Subtitles Translated Simultaneously: The number of subtitle lines translated simultaneously, default 15
Number of Retries for Translation Errors: The number of retries when a translation error occurs, default 2
Pause Time After Translation: Pause time in seconds after each translation request, used to limit the request frequency.
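The three translation settings above (lines per request, retries, and pause) fit together roughly as in the sketch below. This is an illustrative loop, not the software's implementation: `translate_in_batches` is a hypothetical helper, and `translate` stands in for whatever function sends one batch to a translation service.

```python
import time
from typing import Callable


def translate_in_batches(lines: list[str],
                         translate: Callable[[list[str]], list[str]],
                         batch_size: int = 15, retries: int = 2,
                         pause: float = 0.0) -> list[str]:
    """Translate subtitle lines in batches, retrying failed batches."""
    results: list[str] = []
    for i in range(0, len(lines), batch_size):
        batch = lines[i:i + batch_size]
        for attempt in range(retries + 1):  # first try plus the allowed retries
            try:
                results.extend(translate(batch))
                break
            except Exception:
                if attempt == retries:
                    raise  # out of retries, surface the error
        if pause > 0:
            time.sleep(pause)  # limit the request frequency
    return results


# Stand-in translator for demonstration: upper-cases each line.
demo = translate_in_batches([f"line {n}" for n in range(4)],
                            lambda batch: [s.upper() for s in batch],
                            batch_size=2)
print(demo)
```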
Number of Subtitle Lines Dubbed Simultaneously: The number of subtitle lines dubbed simultaneously
AzureTTS Batch Lines: The number of lines dubbed by AzureTTS at a time, default 150
ChatTTS Voice Tone Value: ChatTTS voice tone value