With the rapid development of artificial intelligence technology, the barrier to video translation has dropped sharply, and even a completely local, offline, zero-cost solution is no longer difficult to achieve.
However, the biggest challenges of local deployment are the complex setup and the limits imposed by hardware performance: local models are usually smaller, so translation quality is hard to improve. The full version of pyvideotrans provides both local and online API solutions. Although powerful, downloading it can be a challenge for beginners: the installation package is 1.9GB without models and grows to more than 5GB with them.
To solve these problems, following the 302.ai simplified version, we have launched the Alibaba Bailian simplified version. This version does not require downloading models and has no special requirements for hardware configuration. You only need to activate the service in Alibaba Cloud Bailian and obtain an API KEY, and you can quickly experience the convenience of video translation.
The simplified version includes video translation, speech recognition, subtitle dubbing, and subtitle translation, meeting basic daily needs.
Unlike the full version, the simplified version relies on the platform's API services. After the free quota provided by the platform is used up, you need to pay to continue using it. However, considering its convenient deployment and higher translation quality, as well as the decreasing price of API services, it is undoubtedly worthwhile for users pursuing efficiency.
Of course, if you are completely unwilling to consider paid options, you can continue to use the fully functional pyvideotrans full version.
Bailian Simplified Version Download Address
Baidu Netdisk: https://pan.baidu.com/s/1XsAt8Vt1_IccOKt0QAvC_g?pwd=6rgd
Comparison Table of Full Version and Bailian Simplified Version:
Feature | pyvideotrans Full Version | pyvideotrans Bailian Simplified Version |
---|---|---|
Software Size | 1.9GB without models, 5GB+ with models | 130MB |
Ease of Use | Complex configuration, high customizability | Simple to use, just fill in the API KEY |
VPN Required? | Required for Gemini, ChatGPT, Google channels | Not required |
Usage Cost | Can be completely free with fully local, offline use | Requires activating the Alibaba Cloud Bailian service; paid after the free quota is used up |
Functionality | Powerful, supports all functions of the simplified version plus more | Only supports video translation, speech recognition, speech synthesis, and subtitle translation |
Dubbing Roles | Supports more roles; additional third-party TTS services can be connected through APIs | Alibaba Bailian models support only Chinese, English, German, Italian, and Thai; the built-in edge-tts covers other languages |
How to Choose a Version:
pyvideotrans Full Version is suitable for:
- Those who want to use it completely for free.
- Those who have a certain degree of hands-on ability and are willing to tinker.
- Those who can use a VPN.
- Those who want to deeply understand and master more detailed functions.
pyvideotrans Bailian Simplified Version is suitable for:
- Those who do not want to spend too much energy on deployment and configuration and just want to use it simply.
- Those who are willing to pay for API services.
- Those who are not familiar with or do not want to use a VPN.
The following sections explain how to activate Alibaba Cloud Bailian and Alibaba Cloud OSS, and how to fill in the information in the software.
1: Create an Alibaba Bailian API KEY
- First, you need to have an Alibaba Cloud account and pass real-name authentication.
Register, log in, and authenticate here: https://www.aliyun.com
- Get the API KEY for Alibaba Bailian
After logging in, open this address to go directly to the API KEY page: https://bailian.console.aliyun.com/?apiKey=1#/api-key
Create directly as shown in the figure.
View and copy after creation.
Most models have a free quota.
2: Create an Alibaba Cloud OSS Bucket
Why is OSS needed? Because Alibaba Cloud's speech recognition API does not accept direct uploads of audio and video files. You must pass it a network URL for the file, and the service then downloads the file from that URL on its own servers for recognition.
It's not worth building a server just for this. The easiest way is to upload the file to Alibaba Cloud OSS and pass an intranet address to the API, which also avoids generating download traffic.
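To make the idea of passing a URL more concrete, here is a minimal sketch of how an intranet address for an uploaded file could be formed. It assumes the Bucket is in North China 2 (Beijing), whose region ID is cn-beijing, and that OSS's usual bucket-as-subdomain URL pattern applies. The function name, Bucket name, and object key are hypothetical, and this is not necessarily how the software builds the address internally.

```python
from urllib.parse import quote

# Assumption: North China 2 (Beijing) has the region ID "cn-beijing", and
# internal endpoints follow the pattern oss-<region>-internal.aliyuncs.com.
REGION_ID = "cn-beijing"


def oss_internal_url(bucket: str, object_key: str, region_id: str = REGION_ID) -> str:
    """Build the intranet URL of an object in a public-read Bucket (hypothetical helper)."""
    if not bucket or not object_key:
        raise ValueError("bucket and object_key must be non-empty")
    # Percent-encode the key but keep "/" so folder-like prefixes survive.
    encoded_key = quote(object_key.lstrip("/"), safe="/")
    return f"https://{bucket}.oss-{region_id}-internal.aliyuncs.com/{encoded_key}"


if __name__ == "__main__":
    # "my-bucket" and the object key are placeholder values.
    print(oss_internal_url("my-bucket", "audio/demo 01.wav"))
```

An address of this form is only reachable from inside the same Alibaba Cloud region, which is why the Bucket region matters in the steps that follow.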
1. After logging in to Alibaba Cloud, open the website to activate the OSS service.
Open https://oss.console.aliyun.com/overview directly; if you have not activated the service yet, you will be prompted to do so.
2. After activation, the interface is as follows. Start creating a Bucket.
Click Create Bucket, as shown in the figure below.
Note: You must select the North China 2 (Beijing) region to use the intranet.
Keep other settings as default.
3. Enable Public Read permission.
This must be enabled, otherwise it cannot be accessed.
After the Bucket is created, click Bucket List in the upper left corner, find the name you just created, and click it to enter the Bucket's management interface.
Once inside, click Block Public Access, as shown in the figure below.
As shown in the figure, this setting is on by default. Turn it off.
After confirming that it is off, continue to click "Read/Write Permission", then click "Set", and then select "Public Read". Note that you need to click "Set" first before you can select "Public Read".
After selecting "Public Read", a prompt will pop up. Click "Continue Modification".
Then save it.
You can ignore the warning about extra traffic costs: the files are accessed through the intranet of the North China 2 (Beijing) node and are only used during the speech recognition stage. Once you have finished your video translation work, you can delete all the uploaded files at any time.
3: Get AccessKey
To upload files to OSS, you need an AccessKey.
After creating OSS, directly open this address: https://ram.console.aliyun.com/profile/access-keys
Select as shown in the figure, ignoring its suggestions.
After entering the page, click "Create AccessKey" on the left.
You may then need to verify your mobile phone number. After verification, the automatically created AccessKey ID and AccessKey Secret are displayed.
Save both values; you will need them when filling in the software.
4: Fill in the Alibaba Bailian information into the software
Fill in the OSS Bucket name, Bailian's API KEY, AccessKey ID, and AccessKey Secret created above into the software, as shown in the figure below.
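As a rough illustration of what "filled in completely" means, the sketch below checks that none of the four values is left blank. The class and field names are hypothetical and chosen only for this example; they are not the software's actual configuration keys.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class BailianSettings:
    """The four values collected in steps 1 to 3 (field names are hypothetical)."""
    bucket_name: str = ""
    api_key: str = ""
    access_key_id: str = ""
    access_key_secret: str = ""


def missing_fields(settings: BailianSettings) -> list[str]:
    """Return the names of fields that are empty or contain only whitespace."""
    return [f.name for f in fields(settings) if not getattr(settings, f.name).strip()]


if __name__ == "__main__":
    # Placeholder values; the two AccessKey fields are deliberately left empty.
    partial = BailianSettings(bucket_name="my-bucket", api_key="sk-example")
    print(missing_fields(partial))
```

An empty result means all four values are present; it does not verify that the keys are valid or that the Bucket exists.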
Alibaba Bailian models used in the software
- In the speech recognition stage, that is, converting speech in audio and video into subtitles, the SenseVoiceSmall model is used. It supports more than 20 languages and has a certain free quota.
- In the speech synthesis stage, that is, dubbing according to the subtitles, a combination of CosyVoice, Sambert, and edge-tts is used. edge-tts uses Microsoft's free speech synthesis service, while CosyVoice and Sambert are Alibaba Bailian's speech synthesis models and have a certain free quota.
- In the subtitle translation stage, the Tongyi Qianwen large models qwen-plus-1125, qwen-plus-1127, qwen-turbo-1101, qwen-max, qwen-max-latest, qwen-plus, and qwen2.5-72b-instruct are used. Models whose names end in numbers have a free quota; the others do not.
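The naming rule for the translation models can be applied mechanically. The following sketch sorts the model names listed above by whether the name ends in a digit. It reflects only the rule as stated here; the actual quota is decided by the platform and may change, and the function name is made up for this example.

```python
import re

# Translation model names exactly as listed in this article.
TRANSLATION_MODELS = [
    "qwen-plus-1125",
    "qwen-plus-1127",
    "qwen-turbo-1101",
    "qwen-max",
    "qwen-max-latest",
    "qwen-plus",
    "qwen2.5-72b-instruct",
]


def has_free_quota(model_name: str) -> bool:
    """Apply the article's rule: a name ending in a digit has a free quota."""
    return re.search(r"[0-9]\Z", model_name.strip()) is not None


if __name__ == "__main__":
    free = [m for m in TRANSLATION_MODELS if has_free_quota(m)]
    paid = [m for m in TRANSLATION_MODELS if not has_free_quota(m)]
    print("Free quota:", free)
    print("No free quota:", paid)
```

Note that qwen2.5-72b-instruct contains digits but does not end in one, so under this rule it falls into the group without a free quota.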
Precautions
- If you use video translation or the audio/video-to-subtitle function, you must activate OSS and fill in the Bucket name and AccessKey; otherwise these functions will not work.
- If other functions work normally but audio/video-to-subtitle conversion, that is, speech recognition, fails, the likely cause is that the OSS Bucket was not created or its Public Read permission was not enabled.
- The video translation software itself is free to download and use. Costs incurred by third-party APIs are separate from the software.