CLI Command Line Mode

cli.py is the command-line entry script. Running python cli.py is the simplest way to use it.

Accepted parameters:

-m Absolute path to the MP4 video

Configuration parameters can be set in cli.ini, located in the same directory as cli.py. The MP4 video to be processed can also be specified with the command-line parameter -m followed by the absolute path to the MP4 video, for example: python cli.py -m D:/1.mp4.

cli.ini contains the complete set of parameters. The first parameter, source_mp4, specifies the video to be processed. If -m is passed on the command line, the command-line value takes precedence; otherwise source_mp4 is used.

-c Configuration file path

You can copy cli.ini to another location and select it on the command line with -c followed by the absolute path to cli.ini, for example: python cli.py -c E:/conf/cli.ini. The configuration in that file will be used, and the configuration file in the project directory will be ignored.

-cuda takes no value; simply adding it enables CUDA acceleration (if available): python cli.py -cuda

Example: python cli.py -cuda -m D:/1.mp4
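The flag handling described above can be sketched with Python's argparse. This is an illustrative sketch, not the project's actual parser; the flag names (-m, -c, -cuda) come from the doc, while the function name and defaults are assumptions.

```python
import argparse

def parse_cli_args(argv):
    """Illustrative sketch of the CLI flags described above (assumed parser)."""
    parser = argparse.ArgumentParser(description="pyvideotrans CLI sketch")
    # -m: absolute path to the MP4 video; overrides source_mp4 from cli.ini
    parser.add_argument("-m", dest="source_mp4", default=None)
    # -c: path to an alternative cli.ini (project-directory cli.ini otherwise)
    parser.add_argument("-c", dest="config_path", default="cli.ini")
    # -cuda: bare flag, enables CUDA acceleration when present
    parser.add_argument("-cuda", action="store_true")
    return parser.parse_args(argv)

args = parse_cli_args(["-cuda", "-m", "D:/1.mp4"])
print(args.source_mp4, args.config_path, args.cuda)  # → D:/1.mp4 cli.ini True
```

Note that argparse matches the exact option string "-cuda" before attempting to split it, so it does not collide with "-c".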

Specific Parameters and Descriptions in cli.ini

; Command-line parameters
; Absolute path to the video to be processed, use forward slashes as path separators. Can also be passed via the -m command-line parameter.
source_mp4=
; Network proxy address, required for Google or official ChatGPT in China
proxy=
; Output result file directory
target_dir=
; Video pronunciation language, choose from: zh-cn zh-tw en fr de ja ko ru es th it pt vi ar tr
source_language=zh-cn
; Speech recognition language; leave blank
detect_language=
; Target translation language: zh-cn zh-tw en fr de ja ko ru es th it pt vi ar tr
target_language=en
; Subtitle language for soft subtitles; leave blank
subtitle_language=
; true = Enable CUDA
cuda=false
; Voice role name, for openaiTTS roles: "alloy, echo, fable, onyx, nova, shimmer"; for edgeTTS, find roles in voice_list.json for the corresponding language; for elevenlabsTTS, find roles in elevenlabs.json
voice_role=en-CA-ClaraNeural
; Voice speed adjustment; must start with + (speed up) or - (slow down) and end with %
voice_rate=+0%
; Options: edgeTTS, openaiTTS, elevenlabsTTS
tts_type=edgeTTS
; Silence segment, in milliseconds
voice_silence=500
; Whether to separate and keep background music; true = yes (much slower)
is_separate=false
; all = overall recognition, split = pre-split audio segments for recognition
whisper_type=all
; Speech recognition model options: base, small, medium, large-v3
whisper_model=base
model_type=faster
; Translation service, options: google, baidu, chatGPT, Azure, Gemini, tencent, DeepL, DeepLX
translate_type=google
; 0 = no subtitles, 1 = hard subtitles, 2 = soft subtitles
subtitle_type=1
; true = automatic voice speed adjustment
voice_autorate=false

; DeepL translation API address
deepl_authkey=asdgasg
; Self-configured DeepLX service API address
deeplx_address=http://127.0.0.1:1188
; Tencent translation ID
tencent_SecretId=
; Tencent translation key
tencent_SecretKey=
; Baidu translation ID
baidu_appid=
; Baidu translation secret key
baidu_miyue=
; ElevenlabsTTS key
elevenlabstts_key=
; ChatGPT API address, ending with /v1, can be a third-party API address
chatgpt_api=
; ChatGPT key
chatgpt_key=
; ChatGPT model, options: gpt-3.5-turbo, gpt-4
chatgpt_model=gpt-3.5-turbo
; Azure API address
azure_api=
; Azure key
azure_key=
; Azure model name, options: gpt-3.5-turbo, gpt-4
azure_model=gpt-3.5-turbo
openaitts_role=alloy,echo,fable,onyx,nova,shimmer

; Google Gemini key
gemini_key=
gemini_template=

clone_api=
ttsapi_url=
ttsapi_voice_role=
ttsapi_extra=pyvideotrans

trans_api_url=
trans_secret=

gptsovits_url=
gptsovits_role=
gptsovits_extra=pyvideotrans
back_audio=
only_video=
auto_ajust=false
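A minimal sketch of how cli.ini might be read and the -m precedence rule applied, using Python's configparser. The function name and merge logic are assumptions for illustration; note that cli.ini as shown has no [section] header, so a dummy one must be injected before configparser can parse it.

```python
import configparser

def parse_cli_ini(ini_text, cli_source_mp4=None):
    """Illustrative sketch: parse cli.ini text and apply the -m override.

    In practice, ini_text would come from open("cli.ini", encoding="utf-8").read().
    """
    cp = configparser.ConfigParser()
    cp.optionxform = str  # preserve key case (e.g. tencent_SecretId)
    # cli.ini has no [section] header, so inject a dummy section for configparser
    cp.read_string("[cli]\n" + ini_text)
    cfg = dict(cp["cli"])
    # The command-line -m value takes precedence over source_mp4 from the file
    if cli_source_mp4:
        cfg["source_mp4"] = cli_source_mp4
    return cfg

cfg = parse_cli_ini("source_mp4=\ntarget_language=en\ncuda=false\n",
                    cli_source_mp4="D:/1.mp4")
print(cfg["source_mp4"], cfg["target_language"])  # → D:/1.mp4 en
```

The ";" comment lines in cli.ini are handled automatically, since configparser treats ";" and "#" as comment prefixes by default.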