This article compiles various translation, dubbing, and speech recognition channels, categorized into free and paid options.

It also recommends the best combinations for different usage environments (for example, with or without a VPN), so you can find a suitable setup for your situation.

Completely Free Options

Translation Channels

  • Without VPN or Proxy

    • First choice: DeepSeek or Zhipu AI. Register an account with either service, obtain an API key (SK), and enter it in the corresponding "DeepSeek" or "Zhipu AI" section of the translation settings.
    • Second choice: Microsoft Translator.
  • With VPN and Proxy

    • First choice: Gemini AI Translation, followed by Google Translate.
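
The free translation choices above amount to a lookup keyed on network access. The following is a minimal sketch, assuming a hypothetical helper named `free_translation_channels`; the strings are labels taken from this article, not identifiers from any real API or from the tool's settings.

```python
from typing import List


def free_translation_channels(has_vpn: bool) -> List[str]:
    """Return the free translation channels, most preferred first.

    Hypothetical helper for illustration only; it encodes the
    recommendations of this article and nothing else.
    """
    if has_vpn:
        # With a VPN/proxy: Gemini first, then Google Translate.
        return ["Gemini AI", "Google Translate"]
    # Without a VPN/proxy: DeepSeek or Zhipu AI (treated as equally
    # preferred), then Microsoft Translator as the second choice.
    return ["DeepSeek", "Zhipu AI", "Microsoft Translator"]
```

Calling `free_translation_channels(has_vpn=False)` returns the no-VPN list, with Microsoft Translator as the last entry.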

Dubbing Channels

  • Default choice: edge-TTS, which is free, requires no setup, and supports all languages.
  • When the target language is Chinese, prefer "GPT-SoVITS," "F5-TTS," "CosyVoice," or similar dubbing channels.
  • When the target language is anything else, use edge-TTS.

Speech Recognition Channels

  • When the video language is Chinese

    • First choice: Ali FunASR, the Chinese model from Alibaba's FunASR series, which performs better than Whisper on Chinese speech.
    • Second choice: faster-whisper or openai-whisper (local), with the "large-v2" model and the speech segmentation mode set to "whole recognition."
    • For Chinese, Japanese, or Korean, a subtitle line is split every 20 characters by default; this limit can be adjusted as needed.
  • When the video language is English or other languages

    • First choice: faster-whisper or openai-whisper (local), select the "large-v2" or "large-v3-turbo" model, and choose "whole recognition" for speech segmentation mode.
  • When the video language is a less common language

    • First choice: Gemini Large Model Recognition, with speech segmentation mode set to "whole recognition."
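
To illustrate the 20-character default mentioned above for Chinese, Japanese, and Korean subtitles, here is a minimal sketch of a fixed-width split. The function name `split_cjk_line` and the purely positional splitting are assumptions made for illustration; the actual tool may also take punctuation or timing into account.

```python
from typing import List


def split_cjk_line(text: str, max_chars: int = 20) -> List[str]:
    """Split text into consecutive chunks of at most max_chars characters.

    Hypothetical helper: a naive fixed-width split that counts
    Unicode code points, which is adequate for typical CJK text.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


# A 30-character line becomes two subtitles of 20 and 10 characters.
chunks = split_cjk_line("今天天气很好" * 5)
print([len(chunk) for chunk in chunks])  # prints [20, 10]
```

Passing a different `max_chars` corresponds to adjusting the default limit.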

Note: Gemini is not available in all countries. If you are told that the current country is not supported, switch to a different VPN node, preferably one in Singapore or Japan. Alternatively, choose Google Translate.

Completely Paid Options

If you want higher translation quality, you can use third-party paid APIs.

Translation Channels

  • OpenAI ChatGPT (latest models), Gemini, 302.AI, domestic (China-based) AI services (e.g., DeepSeek, Zhipu AI).

Dubbing Channels

  • Azure TTS, ByteDance Volcano Voice Synthesis, Elevenlabs.io, OpenAI-TTS.

Speech Recognition Channels

  • For Chinese videos, first choice: ByteDance Volcano Subtitle Generation.
  • For videos in other languages, recommended options include faster-whisper or openai-whisper (local) and Deepgram.com.

Best Combinations Without Using a VPN

  • Translation Channels: Domestic AI (e.g., DeepSeek, Zhipu AI), Microsoft Translator.
  • Dubbing Channels: Azure TTS, edge-TTS, GPT-SoVITS, F5-TTS, CosyVoice, QwenTTS.
  • Speech Recognition: faster-whisper or openai-whisper (local), select the "large-v2" or "large-v3-turbo" model, choose "whole recognition" for speech segmentation mode, and check "Chinese re-segmentation."

Best Combinations Without Restrictions on Payment/VPN

  • Translation Channels: the latest OpenAI ChatGPT models, Gemini AI, DeepSeek, Google Translate, Microsoft Translator.
  • Dubbing Channels: Azure TTS, edge-TTS, ByteDance Volcano Voice Synthesis, Elevenlabs.io, OpenAI-TTS, GPT-SoVITS, F5-TTS, CosyVoice, QwenTTS.
  • Speech Recognition: faster-whisper or openai-whisper (local), ByteDance Volcano Subtitle Generation, Ali FunASR.

Easiest and Simplest Combination (No Proxy or Configuration Required)

  • Translation Channels: Microsoft Translator (or Google Translate, if you are using a VPN).
  • Dubbing Channels: edge-TTS.
  • Speech Recognition: faster-whisper (local).

Best Speech Recognition Channels for Chinese-Language Videos

  • ByteDance Volcano Subtitle Generation
  • Ali FunASR
  • faster-whisper (local, large-v2/large-v3-turbo model)
  • openai-whisper (local, large-v2/large-v3-turbo model)

Best Speech Recognition Channels for Videos in Other Languages

  • Gemini Large Model Recognition
  • faster-whisper (local, large-v2/large-v3-turbo model)
  • openai-whisper (local, large-v2/large-v3-turbo model)

Best Performing Translation Channels

  1. Latest OpenAI ChatGPT models / Gemini
  2. Domestic AI translation (e.g., DeepSeek, Zhipu AI)
  3. Google / DeepL
  4. Microsoft Translator / Tencent Translator / Baidu Translator
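
The ranking above can be used to pick the best channel among those you have access to. The sketch below assumes a hypothetical `best_available` helper and represents each rank as a tier of label strings taken from this article; DeepSeek and Zhipu AI stand in for "domestic AI" as the examples named earlier. It is an illustration of the ranking, not code from any tool.

```python
from typing import List, Optional, Set

# Tiers copied from the ranking above; earlier tiers rank higher.
TRANSLATION_TIERS: List[List[str]] = [
    ["OpenAI ChatGPT", "Gemini"],
    ["DeepSeek", "Zhipu AI"],  # examples of domestic AI translation
    ["Google", "DeepL"],
    ["Microsoft Translator", "Tencent Translator", "Baidu Translator"],
]


def best_available(available: Set[str]) -> Optional[str]:
    """Return the highest-ranked channel present in `available`.

    Channels within one tier are treated as equivalent, so the first
    match in listing order wins. Returns None if nothing matches.
    """
    for tier in TRANSLATION_TIERS:
        for channel in tier:
            if channel in available:
                return channel
    return None


# DeepL sits in a higher tier than Microsoft Translator.
print(best_available({"Microsoft Translator", "DeepL"}))  # prints DeepL
```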