A free web service, based on OpenAI Whisper, for transcribing speech to text. It runs in your browser, with no registration or login needed.
The model downloads and runs locally, ensuring your files are never uploaded to any external server.
Access URL
Available Models
The tool offers multiple model options, including:
- tiny
- base
- small
- medium
- large-v1
- large-v3
Model Features:
- Smaller models (e.g., tiny, base) run faster but have lower transcription accuracy.
- Larger models (e.g., large-v1, large-v3) provide higher accuracy but run slower and may crash the browser on low-performance devices.
How to Use
- Upload File: Click to select the audio or video file you want to transcribe.
- Choose Model: Select a suitable model based on your device's performance.
  - Use tiny or base for weaker devices.
  - Choose small or medium for stronger devices.
  - Avoid large models unless your device is high-performance to prevent browser crashes.
- Select Language: Specify the spoken language in your audio or video.
- Model Download: On first use of a model, it will be downloaded from Hugging Face. Since the site may not be directly accessible in some regions, using a VPN is recommended for smooth downloads.
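The model-selection guidance above could be sketched as a small helper. This is only an illustration: the `DeviceInfo` shape, the `suggestModel` name, and every threshold are assumptions, not part of the tool. In a browser the inputs might come from `navigator.deviceMemory` (Chromium-only, and capped at 8) and `navigator.hardwareConcurrency`.

```typescript
// Model names as listed under "Available Models".
type WhisperModel = "tiny" | "base" | "small" | "medium" | "large-v1" | "large-v3";

// Hypothetical device description; not an API of the tool itself.
interface DeviceInfo {
  memoryGb: number; // approximate RAM in GB
  cpuCores: number; // number of logical CPU cores
}

// Suggest a model for a device. All thresholds are illustrative assumptions.
// Large models are never suggested automatically, since they may crash the
// browser; the user has to opt into them explicitly.
function suggestModel(device: DeviceInfo): WhisperModel {
  if (device.memoryGb <= 2 || device.cpuCores <= 2) {
    return "tiny";
  }
  if (device.memoryGb <= 4 || device.cpuCores <= 4) {
    return "base";
  }
  if (device.cpuCores <= 8) {
    return "small";
  }
  return "medium";
}
```

A suggestion like this would only be a starting point; if transcription is too slow or the tab becomes unresponsive, step down to the next smaller model.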
Important Notes
- Privacy & Security: The model runs entirely locally after download; your files are never uploaded to any server.
- Performance Dependency: Which models your device can run, and how fast transcription completes, depend on your device's performance.
- System Recommendations: Use Chrome on Windows or Linux for best results. Support for M-series chips on Mac devices may be limited.
Technical Details
- Implementation: Built with Transformers.js, enabling large models to run directly in the browser.
- Model Source: Uses OpenAI Whisper models, optimized and converted via Xenova/whisper-web.
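To give a rough idea of what an in-browser transcription call with Transformers.js can look like, here is a minimal sketch. It is not the tool's actual code. The types below are simplified stand-ins declared locally so the snippet is self-contained, the `transcribe` helper is hypothetical, and the chunking values are assumptions. The library's `pipeline` function is passed in as a parameter rather than imported.

```typescript
// Simplified, locally declared stand-ins for the library's types (assumptions).
type AsrOptions = {
  language?: string;
  task?: "transcribe" | "translate";
  chunk_length_s?: number; // length of each audio chunk, in seconds
  stride_length_s?: number; // overlap between neighbouring chunks, in seconds
};
type AsrResult = { text: string };
type Transcriber = (
  audio: Float32Array | string,
  options?: AsrOptions,
) => Promise<AsrResult>;
type PipelineFactory = (task: string, model: string) => Promise<Transcriber>;

// Hypothetical helper: load a speech-recognition pipeline for the given model
// and return the transcribed text with surrounding whitespace removed.
async function transcribe(
  pipeline: PipelineFactory,
  modelId: string, // e.g. "Xenova/whisper-tiny"
  audio: Float32Array | string, // decoded samples or a URL
  language: string,
): Promise<string> {
  // The first call for a model triggers the download; later calls can use the cache.
  const transcriber = await pipeline("automatic-speech-recognition", modelId);
  const result = await transcriber(audio, {
    language,
    task: "transcribe",
    chunk_length_s: 30, // assumed value for long recordings
    stride_length_s: 5, // assumed overlap
  });
  return result.text.trim();
}
```

With the real library, the `pipeline` argument would be the function exported by `@xenova/transformers`. Its actual signature is broader than the simplified type used here, so a type cast may be needed.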
