Why Audio, Subtitles, and Video Fall Out of Sync
Translation changes both the length of a sentence and the time it takes to say it. A Chinese sentence and its English translation, for example, rarely contain the same amount of text, and they rarely take the same time to speak aloud.
Chinese: 有多远滚多远
English: Get out of here as far as you can!
Chinese: 滚远点
Japanese: ここから出て行け。
If the original video has Chinese audio lasting 2 seconds, but the translated English audio takes 4 seconds, this inevitably causes synchronization issues.
How to Sync When Only Alignment Matters, Not Quality
As mentioned above, if the original audio is 2 seconds and the translated audio is 4 seconds, and you only need synchronization without concern for speech speed or video playback rate, you can simply speed up the audio by 2 times. This reduces the 4-second duration to 2 seconds, achieving sync. Alternatively, slow down the video to extend the original 2-second segment to 4 seconds, which also aligns them.
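The arithmetic in this example can be checked with a minimal Python sketch. The function name `required_ratio` is invented for illustration and is not part of the software; it only shows that the same ratio serves as either the audio speed-up factor or the video slow-down factor.

```python
def required_ratio(original_s: float, dubbed_s: float) -> float:
    """Ratio between the dubbed audio length and the original segment length.

    Used either as an audio speed-up factor or as a video slow-down factor.
    Returns 1.0 when the dubbed audio already fits in the original slot.
    """
    if original_s <= 0 or dubbed_s <= 0:
        raise ValueError("durations must be positive")
    return max(1.0, dubbed_s / original_s)


original_s, dubbed_s = 2.0, 4.0
ratio = required_ratio(original_s, dubbed_s)

# Option 1: speed up the audio so it fits the original 2-second slot.
audio_after_speedup = dubbed_s / ratio

# Option 2: slow down the video so it lasts as long as the 4-second audio.
video_after_slowdown = original_s * ratio

print(ratio, audio_after_speedup, video_after_slowdown)
```

With a 2-second original and a 4-second translation, the ratio is 2: the audio shrinks to 2 seconds in the first option, or the video stretches to 4 seconds in the second.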
Specific Steps for Audio Speed-Up to Achieve Sync:
- In the software interface, select "Auto Audio Speed-Up" and deselect "Auto Video Slow-Down".
- Go to Menu > Tools > Options, and set the maximum audio speed-up multiplier to 100.
This achieves synchronization, but the downside is obvious: each segment is sped up by a different amount, so the speech speed varies from one sentence to the next.
Specific Steps for Video Slow-Down to Achieve Sync:
- Deselect "Auto Audio Speed-Up" in the software interface and select "Auto Video Slow-Down".
- Go to Menu > Tools > Options, and set the maximum video slow-down multiplier to 20.
This also achieves synchronization. Speech speed stays constant, but the video playback rate varies from one segment to the next.
If you only want basic alignment without regard to quality, you can use these two methods.
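This document does not describe how the software applies these speed changes internally. As an illustration only, assuming you wanted to reproduce the two methods by hand with ffmpeg, the sketch below builds filter strings for its `atempo` (audio tempo) and `setpts` (video timestamp) filters. The helper names are hypothetical, and chaining `atempo` stages of at most 2.0 each is a conservative choice, not a requirement stated here.

```python
def atempo_chain(factor: float) -> str:
    """Build an ffmpeg audio filter string that speeds audio up by `factor`.

    Large factors are split into several atempo stages of at most 2.0 each.
    """
    if factor < 1.0:
        raise ValueError("factor must be >= 1.0 for a speed-up")
    stages = []
    remaining = factor
    while remaining > 2.0:
        stages.append("atempo=2.0")
        remaining /= 2.0
    stages.append(f"atempo={remaining:.4f}")
    return ",".join(stages)


def setpts_slowdown(factor: float) -> str:
    """Build an ffmpeg video filter string that slows video down by `factor`."""
    if factor < 1.0:
        raise ValueError("factor must be >= 1.0 for a slow-down")
    return f"setpts={factor:.4f}*PTS"


# 4-second audio into a 2-second slot: speed the audio up 2x.
print(atempo_chain(2.0))
# 2-second video stretched to 4 seconds: multiply timestamps by 2.
print(setpts_slowdown(2.0))
```

The strings would be passed to ffmpeg as `-filter:a` and `-filter:v` arguments respectively.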
Better and Acceptable Synchronization Methods
Clearly, neither method above is practical on its own: audio that is far too fast or video that is far too slow makes for a poor viewing experience. For better results, enable both "Auto Audio Speed-Up" and "Auto Video Slow-Down" so that each absorbs part of the difference.
Specific Steps:
- When selecting Faster mode or OpenAI mode, try to use medium or larger models and choose "Full Recognition".
- In the software interface, select both "Auto Audio Speed-Up" and "Auto Video Slow-Down", and set a small overall speed-up value, such as 10%.
- Go to Menu > Tools > Options, and set the maximum audio speed-up multiplier to 1.8, meaning the speech speed can increase up to 1.8 times normal. You can manually adjust this to 2, 1.5, or any value greater than 1.
- Go to Menu > Tools > Options, and set the maximum video slow-down multiplier to 2, meaning the video can slow down to 0.5 times normal speed. You can change this to 3, 5, or any value greater than 1.
- Even after the steps above, synchronization might not be perfect because of the maximum limits. If a segment still cannot be aligned at the maximum values, the software may leave it unaligned and move on. You can further adjust the subtitle and video options in Menu > Tools > Options.
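To see why the maximum limits can leave a residual mismatch, here is a minimal Python sketch. It assumes the audio is sped up first, up to its cap, and the video is then slowed down to cover what remains, up to its own cap. That ordering, the `SyncPlan` type, and the `plan_sync` function are assumptions made for illustration; the software may split the difference differently.

```python
from dataclasses import dataclass


@dataclass
class SyncPlan:
    audio_speed: float      # factor applied to the dubbed audio (>= 1.0)
    video_slowdown: float   # factor applied to the video segment (>= 1.0)
    residual_s: float       # seconds of audio still overhanging the video


def plan_sync(original_s: float, dubbed_s: float,
              max_audio_speed: float = 1.8,
              max_video_slowdown: float = 2.0) -> SyncPlan:
    """Hypothetical policy: speed up audio first, then slow down video."""
    if original_s <= 0 or dubbed_s <= 0:
        raise ValueError("durations must be positive")
    if dubbed_s <= original_s:
        return SyncPlan(1.0, 1.0, 0.0)  # audio already fits

    # Speed up the audio as far as needed, but no further than the cap.
    audio_speed = min(dubbed_s / original_s, max_audio_speed)
    audio_len = dubbed_s / audio_speed

    # Slow the video to cover the remaining gap, again up to the cap.
    video_slowdown = min(audio_len / original_s, max_video_slowdown)
    video_len = original_s * video_slowdown

    return SyncPlan(audio_speed, video_slowdown, max(0.0, audio_len - video_len))


print(plan_sync(2.0, 4.0))  # both caps are enough: no residual
print(plan_sync(1.0, 5.0))  # both caps are hit: some audio still overhangs
```

In the first call the 1.8 audio cap leaves about 2.22 seconds of audio, which a mild video slow-down absorbs. In the second call both caps are reached and roughly 0.78 seconds of audio remain unaligned, which is the situation the last step describes.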
Is There a Perfect Synchronization Method?
Apart from manual intervention, such as refining translations or adding transitional scenes, there is currently no automated program that can achieve perfect synchronization.
An automated process would have to keep the audio speed-up within an acceptable range, keep the video slow-down within an acceptable range, and align mouth movements precisely with the start of each utterance, all at once, for videos of any length and for any language pair. At present that appears to be impossible. There is no perfect method without manual adjustments.