Skip to content

With the rapid advancement of AI technology, the barrier to video translation has significantly lowered, making fully local, offline, and zero-cost solutions quite achievable.

However, the main challenges of local deployment are its complexity and hardware limitations, often resulting in smaller models and suboptimal translation quality. The full version of pyvideotrans offers both local and online API solutions. While powerful, it can be daunting for beginners—even the download is a hurdle, with the installation package alone at 1.9GB (excluding models), and over 5GB when models are included.

To address these issues, following the simplified 302.ai version, we've launched the Alibaba Bailian simplified version. This version requires no model downloads and has no special hardware requirements. Simply activate the service on Alibaba Cloud Bailian, obtain an API KEY, and quickly experience the convenience of video translation.

The simplified version includes features like video translation, speech recognition, subtitle dubbing, and subtitle translation, meeting basic daily needs.

Unlike the full version, all features in the simplified version rely on platform API services. After the free quota provided by the platform is exhausted, you'll need to pay for continued use. However, considering the ease of deployment, higher translation quality, and the decreasing cost of API services, it's a worthwhile option for efficiency-focused users.

Of course, if you prefer to avoid any payment, you can still use the fully-featured pyvideotrans full version.

Download Links for Bailian Simplified Version

Baidu Netdisk: https://pan.baidu.com/s/1XsAt8Vt1_IccOKt0QAvC_g?pwd=6rgd

Github: https://github.com/jianchang512/pyvideotrans/releases/download/v3.36/pyvideotrans-ali-bailian-3.88.7z

Comparison Table: Full Version vs. Bailian Simplified Version

Featurepyvideotrans Full Versionpyvideotrans Bailian Simplified Version
Software Size1.9GB (without models), 5GB+ (with models)130MB
Ease of UseComplex setup, highly customizableSimple to use, just fill in API KEY
VPN RequiredRequired for Gemini, ChatGPT, Google channelsNot required
CostCompletely free, fully local and offlineRequires Alibaba Cloud Bailian service; payment needed after free quota
FeaturesPowerful, supports all simplified features plus moreOnly supports video translation, speech recognition, TTS, subtitle translation
Voice RolesSupports many, with API for third-party TTS servicesAlibaba Bailian models support Chinese, English, German, Italian, Thai; built-in edge-tts for more languages

How to Choose a Version:

  • pyvideotrans Full Version is suitable for:

    • Users who want completely free usage.
    • Those with some technical skills willing to tinker.
    • Users who can use a VPN.
    • Those wanting to explore and master more detailed features.
  • pyvideotrans Bailian Simplified Version is suitable for:

    • Users who don't want to spend much effort on deployment and configuration, preferring simplicity.
    • Users willing to pay for API services.
    • Those unfamiliar with or unwilling to use a VPN.

Below are instructions on how to activate Alibaba Cloud Bailian and Alibaba Cloud OSS, as well as how to fill in the details in the software.

Step 1: Create an Alibaba Cloud Bailian API KEY

  1. First, you need an Alibaba Cloud account with real-name verification.

    Register, log in, and complete verification at: https://www.aliyun.com

  2. Obtain the Alibaba Bailian API KEY.

    After logging in, go directly to this URL to access the API KEY page: https://bailian.console.aliyun.com/?apiKey=1#/api-key

    Create it as shown in the image.

image.png

After creation, view and copy the API KEY.

image.png

Most models come with a free quota.

Step 2: Create an Alibaba Cloud OSS Bucket

Why is this needed? Because Alibaba Cloud's speech recognition API does not support direct upload of audio/video files; it requires a network URL to download the file on its servers.

Instead of setting up your own server, the simplest way is to use Alibaba Cloud OSS, upload files there, and provide an internal network URL to the API, avoiding download traffic costs.

1. Log in to Alibaba Cloud and open the URL to activate OSS service.

Go directly to: https://oss.console.aliyun.com/overview If not activated, you'll be prompted to do so.

2. After activation, the interface will look like this; start by creating a Bucket.

Click Create Bucket as shown below.

image.png

Note: You must select the region "China North 2 (Beijing)" for internal network usage.

image.png

Keep other settings as default.

3. Enable public read permissions.

This is necessary to allow access.

After creation, click Bucket List in the top left, find your Bucket name, and enter its management interface.

image.png

Once inside, click Block Public Access as shown.

image.png

By default, it's enabled; turn it off.

image.png

image.png

After confirming, click "Read/Write Permissions," then "Settings," and select "Public Read." Note: You must click "Settings" first to choose "Public Read."

image.png

After selecting "Public Read," a prompt will appear; click "Continue to Modify."

image.png

Then save the settings.

image.png

Don't worry about potential extra traffic fees; in the China North 2 (Beijing) region, access is internal, and uploaded files are only used during speech recognition. You can delete all files after completing video translation.

Step 3: Obtain AccessKey

To upload files to OSS, you need an AccessKey.

After creating OSS, go directly to: https://ram.console.aliyun.com/profile/access-keys

Follow the selection as shown; ignore any recommendations.

image.png

On the page, click "Create AccessKey" on the left.

image.png

You may need to verify your phone number. After verification, the AccessKey ID and AccessKey Secret will be displayed.

image.png

image.png

Remember these two pieces of information.

Step 4: Fill in Alibaba Bailian Details in the Software

Enter the OSS Bucket name, Bailian API KEY, AccessKey ID, and AccessKey Secret into the software, as shown below.

image.png

Alibaba Bailian Models Used in the Software

  1. During speech recognition (converting speech in audio/video to subtitles), the SenseVoiceSmall model is used, supporting over 20 languages with a free quota.
  2. For speech synthesis (dubbing based on subtitles), a combination of CosyVoice, Sambert, and edge-tts is used. edge-tts is Microsoft's free TTS service, while CosyVoice and Sambert are Alibaba Bailian TTS models with free quotas.
  3. For subtitle translation, the Tongyi Qianwen large models are used: qwen-plus-1125,qwen-plus-1127,qwen-turbo-1101,qwen-max,qwen-max-latest,qwen-plus,qwen2.5-72b-instruct. Models ending in numbers have free quotas; others do not.

Important Notes

  1. If using video translation or audio/video to subtitle features, you must activate OSS and fill in the Bucket name and AccessKey; otherwise, these functions won't work.
  2. If other features work but speech recognition fails, it's likely due to not creating OSS or not enabling public read permissions on the Bucket.
  3. The video translation software itself is free to download and use; any costs from third-party APIs are separate and not related to the software.