Gemini AI is not only a capable large language model for chat, but also works well as a speech recognition and audio/video-to-text tool. Its free tier allows more than 1500 calls per day, which is generally enough for everyday use.
How to Activate Gemini AI Service
First, visit the Google AI Studio page: https://aistudio.google.com/ and check whether it opens for you.
A proxy or VPN is a prerequisite: This may be the only hurdle to using Gemini AI. Sometimes, even with a proxy or VPN enabled, the site may still show an "Unsupported country or region" message.
If that happens, try switching VPN nodes until the page correctly displays the following interface:
Get API Key: In the upper left corner of the page shown in the figure above, you will see a Get API Key button. Click it and then create a new key.
Paste API Key: Paste the API key you obtained into the pyVideoTrans software. To do so, open the software's settings menu, find the "Gemini Pro Gemini Key" option, and paste the key in.
Use in Video Translation and Dubbing Software
First, upgrade to the v3.07 patch package.
- Fill in your key and the model to use under Menu Bar--Translation Settings--Gemini pro. You can also modify the transcription prompt there.
- Don't forget to enable the proxy/VPN; otherwise the requests will fail.
- In the speech recognition channel, select "Gemini Large Model Recognition", upload the audio or video, select the pronunciation language, and do not select "Chinese re-segmentation". Gemini already breaks sentences well on its own, and enabling that option may make the result worse.
- Wait for the recognition result. If you are not satisfied with it, adjust the prompt and run the recognition again.
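For readers curious about what a transcription request with a custom prompt might look like, the sketch below builds a request body for the Gemini REST `generateContent` endpoint with the audio embedded inline. This is an illustration under stated assumptions, not pyVideoTrans's actual implementation: the function name, the sample prompt, and the file name in the usage comment are hypothetical, and inline data is only suitable for short clips.

```python
import base64
import json


def build_transcription_request(audio_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a generateContent request body: a text prompt plus inline audio."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


# Example usage (the prompt and file name are placeholders):
#   with open("clip.wav", "rb") as f:
#       body = build_transcription_request(
#           f.read(), "audio/wav",
#           "Transcribe this audio and output one sentence per line.",
#       )
#   payload = json.dumps(body).encode("utf-8")
#   # POST payload to .../v1beta/models/<model>:generateContent?key=<key>
```

Changing the prompt text is the equivalent of editing the transcription prompt in the settings: the audio stays the same while the instructions to the model differ.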