
ElevenLabs, widely regarded as one of the leading AI voice companies, recently introduced a speech recognition model, scribe_v1, that transcribes audio into text in 99 languages.

It also offers a generous free tier: each upload can be an audio or video file of up to 1GB.

This article describes two ways to use it: in the pyVideoTrans video translation software, and online on the web.

Using in Video Translation Software

  1. Upgrade to version v0.59: https://pvt9.com/downpackage

  2. Go to this page to create an API key: https://elevenlabs.io/app/settings/api-keys

  3. In the video translation software, go to Menu--TTS Settings--Elevenlabs.io and enter the API key you copied, then save it.

  4. Select Elevenlabs.io as the speech recognition channel to use it.
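For readers curious about what an API key from step 2 makes possible outside the software, the following is a minimal sketch of a direct transcription request using only the Python standard library. It is not taken from pyVideoTrans. The endpoint URL, the `xi-api-key` header, the `file` and `model_id` form fields, and the `text` field in the response are assumptions based on ElevenLabs' public API documentation and should be checked against the current docs. The helper names (`build_multipart`, `build_request`, `transcribe`) are hypothetical.

```python
import json
import os
import urllib.request
import uuid

# Assumed endpoint; verify against the current ElevenLabs API reference.
STT_URL = "https://api.elevenlabs.io/v1/speech-to-text"


def build_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # The uploaded media goes in a form field assumed to be named "file".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, file_name, file_bytes, model_id="scribe_v1"):
    """Build (but do not send) the POST request for one transcription."""
    body, content_type = build_multipart({"model_id": model_id}, file_name, file_bytes)
    return urllib.request.Request(
        STT_URL,
        data=body,
        method="POST",
        # Assumed authentication header name.
        headers={"xi-api-key": api_key, "Content-Type": content_type},
    )


def transcribe(api_key, path):
    """Upload a local audio or video file and return the transcribed text."""
    with open(path, "rb") as f:
        req = build_request(api_key, os.path.basename(path), f.read())
    with urllib.request.urlopen(req) as resp:
        # Assumes the JSON response carries the transcript under "text".
        return json.load(resp).get("text", "")
```

Calling `transcribe("YOUR_API_KEY", "interview.mp3")` would send the file and return the transcript as a string, assuming the request shape above matches the live API. The sketch reads the whole file into memory, so it is only suitable for small files.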

Using on the Web

  1. Go to this webpage: https://elevenlabs.io/app/speech-to-text. If you don't have an account, please register with your email. No phone verification, card binding, or recharge is required.
  2. After logging in, click Speech to text on the left.

  3. After the transcription is complete, click the displayed name to enter the transcription result page.