ElevenLabs, a leading AI voice company, recently launched a speech recognition model, Scribe_v1, which transcribes audio into text in 99 languages.

It also comes with a generous free quota and accepts audio or video uploads of up to 1GB per session.

This article introduces two ways to use it: inside the video translation software pyVideoTrans, and online on the ElevenLabs website.

Using It in the Video Translation Software

  1. Upgrade to v0.59, available at https://pvt9.com/downpackage.
  2. Visit this page to create an API key: https://elevenlabs.io/app/settings/api-keys.
  3. In the video translation software, go to Menu → TTS Settings → Elevenlabs.io, paste the API key you copied, and save.
  4. Select Elevenlabs.io as the speech recognition channel to start using it.
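
For readers who want to call the model directly with the same API key, the following is a minimal Python sketch, not the code pyVideoTrans itself uses. It assumes the public ElevenLabs speech-to-text endpoint at `https://api.elevenlabs.io/v1/speech-to-text`, an `xi-api-key` header, multipart form fields named `model_id` and `file`, and a `text` field in the JSON response. None of these details appear in this article, so check them against the official API reference. The helper names `build_request` and `transcribe` are hypothetical.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Assumed endpoint; verify against the official ElevenLabs API reference.
API_URL = "https://api.elevenlabs.io/v1/speech-to-text"


def build_request(api_key, filename, audio, model_id="scribe_v1"):
    """Build a multipart/form-data POST carrying the model ID and audio bytes."""
    boundary = uuid.uuid4().hex
    # First part: the model_id form field. Second part: the file itself.
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model_id"\r\n\r\n'
        f"{model_id}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + audio + tail,
        headers={
            "xi-api-key": api_key,  # assumed header name for the API key
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


def transcribe(api_key, path):
    """Upload one audio or video file and return the transcribed text."""
    p = Path(path)
    req = build_request(api_key, p.name, p.read_bytes())
    with urllib.request.urlopen(req) as resp:
        # The "text" field is an assumption about the response shape.
        return json.load(resp)["text"]
```

Calling `transcribe("YOUR_API_KEY", "sample.mp3")` would then return the transcript as a string, provided the assumptions above hold. The sketch reads the whole file into memory, which is fine for short clips but not ideal for uploads near the 1GB limit.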

Using It on the Web

  1. Go to https://elevenlabs.io/app/speech-to-text. If you don't have an account, register with your email; no phone verification, card binding, or top-up is required.
  2. After logging in, click Speech to text in the left sidebar and upload the audio or video file you want to transcribe.
  3. Wait for the transcription to complete, then click on the displayed name to enter the transcription results page.