Two Key Factors Determining Quality:
The accuracy of the recognized text.
The quality of the translated text.
The accuracy of the recognized text directly determines the quality of the translation, so improving the final result means working on both aspects.
I: Improving Text Recognition Accuracy:
1. Use the large-v3 model.
Recognition accuracy improves step by step from the base model through the small and medium models to the large-v3 model, but so does the demand on computer resources. If your computer has a high-performance NVIDIA graphics card with at least 8 GB of video memory, and you have set up the CUDA and cuDNN environment, try the large-v3 model; it can significantly improve subtitle recognition accuracy.
[View CUDA and cuDNN environment installation method](https://juejin.cn/post/7318704408727519270)
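The rule of thumb above can be sketched as a small helper. This is only an illustration: the function name `pick_whisper_model` and its parameters are hypothetical and not part of the software, and the `fallback` choice of a smaller model is an assumption, since the text only states when large-v3 is worth trying.

```python
def pick_whisper_model(has_nvidia_gpu: bool, vram_gb: float,
                       cuda_ready: bool, fallback: str = "small") -> str:
    """Suggest a model size from the hardware described in the text.

    fallback is an assumption: the text does not say which smaller
    model to prefer when large-v3 is not practical.
    """
    # From the text: large-v3 is worth trying with an NVIDIA card,
    # at least 8 GB of video memory, and CUDA/cuDNN configured.
    if has_nvidia_gpu and cuda_ready and vram_gb >= 8:
        return "large-v3"
    # Otherwise stay with a lighter model to limit resource usage.
    return fallback


if __name__ == "__main__":
    print(pick_whisper_model(True, 12, True))
    print(pick_whisper_model(True, 6, True))
```

If the larger model turns out to be too slow in practice, passing a different `fallback` (for example `"medium"`) keeps the same check while trading accuracy for speed.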
2. Separate the background sound in the video.
If the video contains a lot of background music or noise, it will interfere with recognition. Try selecting "Keep Background Sound": the software then separates the background sound before recognition and recognizes only the human speech, which gives much better results.
Alternatively, you can use a third-party separation tool, or the "Separate Human Voice Background" function on the left side of the software, to split the video's audio into a human voice track and a background track yourself.
Then use the "Audio and Video to Text" function to recognize the human voice track on its own and obtain the subtitles.
Next, under "Text Subtitle Translation", translate the subtitles into the target language.
Finally, in "Standard Function Mode", import the translated subtitles, add the background music, and embed the dubbing and subtitles into the video. These steps are somewhat cumbersome, but they can significantly improve the translation result.
3. Manually modify and adjust.
After subtitle recognition finishes, and again after translation finishes, the complete text is displayed in the subtitle area on the right side of the software. Click the pause button to pause processing, then edit the text by hand. No matter how accurate machine recognition and translation are, they are no substitute for manual proofreading.
II: Improving the Quality of Text Subtitle Translation
Of the available translation channels, ChatGPT, DeepL, and Azure give the best translation quality. All three require paid accounts and do not accept payment from users in mainland China, and ChatGPT and Azure also require a proxy, so the barrier to entry is high.
If you have a paid account and can configure a proxy, use these three channels to improve translation quality (many relay proxy services for ChatGPT are available in China).
The next best results come from Google, Gemini, and Microsoft. All three are free; Google and Gemini require a proxy, while Microsoft does not.
Note, however, that Gemini has stricter safety restrictions. If the dialogue in your video contains age-restricted content, Gemini may refuse to translate it.
After those, you can choose Baidu Translate or Tencent Translate. Each requires you to apply for a free key and appid on its website. Tencent offers a fairly generous free quota, while Baidu's free quota is very low.
In summary, if you meet the conditions, the first choice is ChatGPT/DeepL, then Google, then Microsoft, and finally Tencent Translate and Baidu Translate.
You can also use DeepLx to access DeepL for free, but it is unstable and your IP address can easily get blocked.
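The channel requirements and priority described above can be summarized in a short sketch. The `Channel` class, the `usable_channels` function, and the flag names are hypothetical and exist only for illustration. The positions of Azure and Gemini within their tiers are an assumption, because the summary does not rank them explicitly, and DeepLx is left out because of the stability issues mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    name: str
    needs_paid_account: bool
    needs_proxy: bool
    needs_free_key: bool  # free key/appid applied for on the provider's site


# Ordered from most to least preferred, following the tiers in the text.
# Azure and Gemini are placed at the end of their tiers (an assumption).
CHANNELS = [
    Channel("ChatGPT", True, True, False),
    Channel("DeepL", True, False, False),
    Channel("Azure", True, True, False),
    Channel("Google", False, True, False),
    Channel("Gemini", False, True, False),
    Channel("Microsoft", False, False, False),
    Channel("Tencent Translate", False, False, True),
    Channel("Baidu Translate", False, False, True),
]


def usable_channels(has_paid_account: bool, has_proxy: bool,
                    has_free_key: bool) -> list[str]:
    """Return the channels whose requirements are met, best first."""
    return [
        c.name for c in CHANNELS
        if (has_paid_account or not c.needs_paid_account)
        and (has_proxy or not c.needs_proxy)
        and (has_free_key or not c.needs_free_key)
    ]


if __name__ == "__main__":
    # A user with a proxy but no paid account and no free keys.
    print(usable_channels(False, True, False))
```

The first entry of the returned list is the suggested channel; with no paid account, proxy, or keys at all, only Microsoft remains.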
As with recognition, a pause button appears once translation is complete. Click it, and you can check and correct the translation manually in the subtitle area on the right.