With the rapid advancement of AI technology, the barrier to video translation has significantly lowered, making fully local, offline, and zero-cost solutions quite achievable.
However, the main challenges of local deployment are its complexity and hardware limitations, often resulting in smaller models and suboptimal translation quality. The full version of pyvideotrans
offers both local and online API solutions. While powerful, it can be daunting for beginners—even the download is a hurdle, with the installation package alone at 1.9GB (excluding models), and over 5GB when models are included.
To address these issues, following the simplified 302.ai
version, we've launched the Alibaba Bailian simplified version. This version requires no model downloads and has no special hardware requirements. Simply activate the service on Alibaba Cloud Bailian, obtain an API KEY, and quickly experience the convenience of video translation.
The simplified version includes features like video translation, speech recognition, subtitle dubbing, and subtitle translation, meeting basic daily needs.
Unlike the full version, all features in the simplified version rely on platform API services. After the free quota provided by the platform is exhausted, you'll need to pay for continued use. However, considering the ease of deployment, higher translation quality, and the decreasing cost of API services, it's a worthwhile option for efficiency-focused users.
Of course, if you prefer to avoid any payment, you can still use the fully-featured pyvideotrans
full version.
Download Links for Bailian Simplified Version
Baidu Netdisk: https://pan.baidu.com/s/1XsAt8Vt1_IccOKt0QAvC_g?pwd=6rgd
Comparison Table: Full Version vs. Bailian Simplified Version
Feature | pyvideotrans Full Version | pyvideotrans Bailian Simplified Version |
---|---|---|
Software Size | 1.9GB (without models), 5GB+ (with models) | 130MB |
Ease of Use | Complex setup, highly customizable | Simple to use, just fill in API KEY |
VPN Required | Required for Gemini, ChatGPT, Google channels | Not required |
Cost | Completely free, fully local and offline | Requires Alibaba Cloud Bailian service; payment needed after free quota |
Features | Powerful, supports all simplified features plus more | Only supports video translation, speech recognition, TTS, subtitle translation |
Voice Roles | Supports many, with API for third-party TTS services | Alibaba Bailian models support Chinese, English, German, Italian, Thai; built-in edge-tts for more languages |
How to Choose a Version:
pyvideotrans
Full Version is suitable for:- Users who want completely free usage.
- Those with some technical skills willing to tinker.
- Users who can use a VPN.
- Those wanting to explore and master more detailed features.
pyvideotrans
Bailian Simplified Version is suitable for:- Users who don't want to spend much effort on deployment and configuration, preferring simplicity.
- Users willing to pay for API services.
- Those unfamiliar with or unwilling to use a VPN.
Below are instructions on how to activate Alibaba Cloud Bailian and Alibaba Cloud OSS, as well as how to fill in the details in the software.
Step 1: Create an Alibaba Cloud Bailian API KEY
First, you need an Alibaba Cloud account with real-name verification.
Register, log in, and complete verification at: https://www.aliyun.com
Obtain the Alibaba Bailian API KEY.
After logging in, go directly to this URL to access the API KEY page: https://bailian.console.aliyun.com/?apiKey=1#/api-key
Create it as shown in the image.
After creation, view and copy the API KEY.
Most models come with a free quota.
Step 2: Create an Alibaba Cloud OSS Bucket
Why is this needed? Because Alibaba Cloud's speech recognition API does not support direct upload of audio/video files; it requires a network URL to download the file on its servers.
Instead of setting up your own server, the simplest way is to use Alibaba Cloud OSS, upload files there, and provide an internal network URL to the API, avoiding download traffic costs.
1. Log in to Alibaba Cloud and open the URL to activate OSS service.
Go directly to: https://oss.console.aliyun.com/overview If not activated, you'll be prompted to do so.
2. After activation, the interface will look like this; start by creating a Bucket.
Click Create Bucket
as shown below.
Note: You must select the region "China North 2 (Beijing)" for internal network usage.
Keep other settings as default.
3. Enable public read permissions.
This is necessary to allow access.
After creation, click Bucket List
in the top left, find your Bucket name, and enter its management interface.
Once inside, click Block Public Access
as shown.
By default, it's enabled; turn it off.
After confirming, click "Read/Write Permissions," then "Settings," and select "Public Read." Note: You must click "Settings" first to choose "Public Read."
After selecting "Public Read," a prompt will appear; click "Continue to Modify."
Then save the settings.
Don't worry about potential extra traffic fees; in the China North 2 (Beijing) region, access is internal, and uploaded files are only used during speech recognition. You can delete all files after completing video translation.
Step 3: Obtain AccessKey
To upload files to OSS, you need an AccessKey.
After creating OSS, go directly to: https://ram.console.aliyun.com/profile/access-keys
Follow the selection as shown; ignore any recommendations.
On the page, click "Create AccessKey" on the left.
You may need to verify your phone number. After verification, the AccessKey ID and AccessKey Secret will be displayed.
Remember these two pieces of information.
Step 4: Fill in Alibaba Bailian Details in the Software
Enter the OSS Bucket name, Bailian API KEY, AccessKey ID, and AccessKey Secret into the software, as shown below.
Alibaba Bailian Models Used in the Software
- During speech recognition (converting speech in audio/video to subtitles), the
SenseVoiceSmall
model is used, supporting over 20 languages with a free quota. - For speech synthesis (dubbing based on subtitles), a combination of
CosyVoice
,Sambert
, andedge-tts
is used.edge-tts
is Microsoft's free TTS service, whileCosyVoice
andSambert
are Alibaba Bailian TTS models with free quotas. - For subtitle translation, the Tongyi Qianwen large models are used:
qwen-plus-1125,qwen-plus-1127,qwen-turbo-1101,qwen-max,qwen-max-latest,qwen-plus,qwen2.5-72b-instruct
. Models ending in numbers have free quotas; others do not.
Important Notes
- If using video translation or audio/video to subtitle features, you must activate OSS and fill in the Bucket name and AccessKey; otherwise, these functions won't work.
- If other features work but speech recognition fails, it's likely due to not creating OSS or not enabling public read permissions on the Bucket.
- The video translation software itself is free to download and use; any costs from third-party APIs are separate and not related to the software.