Skip to content

With the rapid development of artificial intelligence technology, the barrier to entry for video translation has been greatly reduced. Even a completely local, offline, and zero-cost solution is not difficult to achieve.

However, the biggest challenge with local deployment solutions is the complexity of deployment and the limitation of hardware performance. Models are often smaller, and the translation quality is difficult to optimize. The full version of pyvideotrans provides both local and online API solutions. Although powerful, downloading it can be a challenge for beginners - the installation package without models is as large as 1.9GB, and the volume exceeds 5GB with the models.

To solve these problems, following the 302.ai Lite version, we have launched the Alibaba Bailian Lite version. This version does not require downloading models and has no special hardware requirements. You only need to activate the service in Alibaba Cloud Bailian and obtain the API KEY to quickly experience the convenience of video translation.

The Lite version includes video translation, speech recognition, subtitle dubbing, and subtitle translation, which meet basic daily needs.

Unlike the full version, the Lite version's functions all rely on the platform's API services. After the platform's free quota is used up, you need to pay to continue using it. However, considering its convenient deployment and higher translation quality, as well as the decreasing price of API services, it is undoubtedly worthwhile for users pursuing efficiency.

Of course, if you do not consider paid options at all, you can continue to use the full-featured pyvideotrans full version.

Bailian Lite Download Address

Baidu Netdisk: https://pan.baidu.com/s/1XsAt8Vt1_IccOKt0QAvC_g?pwd=6rgd

Github: https://github.com/jianchang512/pyvideotrans/releases/download/v3.36/pyvideotrans-ali-bailian-3.88.7z

Comparison Table of Full Version and Bailian Lite Version:

Featurepyvideotrans Full Versionpyvideotrans Bailian Lite Version
Software Size1.9GB without models, 5GB+ with models130MB
Ease of UseComplex configuration, high customizabilitySimple to use, just fill in the API KEY
VPN Required?Required for Gemini, ChatGPT, Google channelsNot required
Usage CostCan be completely free, fully local offline useNeed to activate Alibaba Cloud Bailian service, pay after free quota is used up
FeaturesPowerful, supports all functions of the Lite version plus moreOnly supports video translation, speech recognition, speech synthesis, and subtitle translation
Dubbing RolesSupports more, can support more third-party TTS services via APIAlibaba Bailian model only supports Chinese, English, German, Italian, and Thai, built-in edge-tts can support more languages

How to Choose a Version:

  • pyvideotrans Full Version is suitable for:

    • Wanting to use it completely free.
    • Having a certain level of hands-on ability and willingness to tinker.
    • Being able to use a VPN.
    • Wanting to deeply understand and master more detailed functions.
  • pyvideotrans Bailian Lite Version is suitable for:

    • Not wanting to spend too much effort on deployment and configuration, just wanting simple use.
    • Willing to pay for API services.
    • Unfamiliar with or unwilling to use a VPN.

The following are the operating instructions for how to activate Alibaba Cloud Bailian and Alibaba Cloud OSS, and the filling instructions in the software.

I: Create Alibaba Bailian API KEY

  1. First, you need to have an Alibaba Cloud account and be real-name authenticated.

    Go to this to register, log in, and authenticate: https://www.aliyun.com

  2. Get the API KEY of Alibaba Bailian

After logging in, directly open this address until the API KEY acquisition page https://bailian.console.aliyun.com/?apiKey=1#/api-key

Create directly as shown in the figure

image.png

View and copy it after creation

image.png

Most models have a free quota.

II: Create Alibaba Cloud OSS Bucket

Why do I need this thing? Because Alibaba Cloud's speech recognition API does not support directly uploading audio and video files. You must pass the network URL address of the audio and video to it, and then it downloads the audio and video on the server for recognition through the URL.

It's not worth building a server specifically for this. The easiest way is to directly use Alibaba Cloud OSS, upload to OSS, and pass an intranet address to the API, which can also avoid generating download traffic.

1. After logging in to Alibaba Cloud, open the website to activate the OSS service

Directly open this address https://oss.console.aliyun.com/overview If you have not activated it, you will be prompted to activate it

2. After activation, the interface is as follows. Start creating a Bucket

Click Create Bucket as shown in the figure

image.png

Note: You must select the North China 2 (Beijing) region to use the intranet

image.png

Keep other settings as default.

3. Enable Public Read Permissions

This must be enabled, otherwise it cannot be accessed

After successful creation, click Bucket List in the upper left corner, find the name you just created, and click to enter the management interface of the Bucket

image.png

After entering, click Block Public Access as shown in the figure

image.png

After clicking, it is enabled by default. Turn it off.

image.png

image.png

After confirming the closure, continue to click "Read and Write Permissions", then click "Settings", and then select "Public Read" Note that you need to click "Settings" first before you can select "Public Read"

image.png

After selecting "Public Read", a prompt pops up, click "Continue Modification"

image.png

Then save it

image.png

Don't worry about the extra traffic costs it prompts, because the North China 2 (Beijing) node is accessed through the intranet, and the uploaded files are only used by the intranet during the speech recognition stage. After you have finished the video translation, you can delete all the uploaded files at any time.

III: Get AccessKey

To upload files to OSS, you need AccessKey

After creating OSS, directly open this address https://ram.console.aliyun.com/profile/access-keys

Select according to the following figure, ignoring its suggestions.

image.png

After entering the page, click "Create AccessKey" on the left

image.png

Then you may need to verify your mobile phone number. After the verification is passed, the automatically created AccessKey ID and AccessKey Secret will be displayed.

image.png

image.png

Remember these 2 pieces of information.

IV: Fill in the Alibaba Bailian information into the software

Fill in the OSS Bucekt name, Bailian's API KEY, AccessKey ID and AccessKey Secret created above into the software, as shown in the figure below.

image.png

Alibaba Bailian models used in the software

  1. In the speech recognition stage, that is, the stage of converting the speech in the audio and video into subtitles, the SenseVoiceSmall model is used, which supports more than 20 languages and has a certain free quota.
  2. In the speech synthesis stage, that is, the stage of dubbing according to the subtitles, a combination of CosyVoice, Sambert, and edge-tts is used. Among them, edge-tts is Microsoft's free speech synthesis service, and CosyVoice and Sambert are Alibaba Bailian's speech synthesis models, which have a certain free quota.
  3. In the subtitle translation stage, the Tongyi Qianwen large model qwen-plus-1125,qwen-plus-1127,qwen-turbo-1101,qwen-max,qwen-max-latest,qwen-plus,qwen2.5-72b-instruct is used. Models ending with numbers have a free quota, others do not.

Precautions

  1. If you use video translation or audio and video to subtitle functions, you must activate OSS and fill in the Bucket name and AccessKey, otherwise you cannot use it.
  2. If other functions are normal, but the audio and video to subtitle or speech recognition function has an error, then it is likely that you have not created OSS, or have not enabled public read permission for the Bucket.
  3. The video translation software itself is free to download and use. The fees generated by third-party APIs are not related to the software.