
Parakeet-API: High-Performance Local Speech-to-Text Service

The parakeet-api project is a local speech-to-text service built on the NVIDIA Parakeet-tdt-0.6b model. It offers an OpenAI API-compatible interface and a clean Web UI, letting you quickly convert any audio or video file into high-accuracy SRT subtitles. It is also compatible with pyVideoTrans v3.72+.

Project Open Source Address: https://github.com/jianchang512/parakeet-api

✨ Key Advantages of Parakeet-API

  • 🚀 Extreme Speed and Performance: The Parakeet model is highly optimized; on NVIDIA GPUs in particular it transcribes extremely fast, making it ideal for large or lengthy audio and video files.
  • 🎯 Precise Timestamps: Built on Transducer (TDT) technology, the generated SRT subtitles carry accurate timestamps that align closely with the audio, making them well suited for video subtitling.
  • 💰 Completely Free, Unlimited Usage: Runs on your own hardware, with no API call costs and no usage limits.
  • 🌐 Flexible Access Methods: Provides both an intuitive Web interface and a standardized API, so it integrates easily into existing workflows such as pyVideoTrans.

🛠️ Installation and Configuration Guide

This project supports Windows, macOS, and Linux. Please follow these steps for installation and configuration.

Step 0: Set Up a Python 3.10 Environment

If you don't have Python 3 installed, please follow this tutorial: https://pvt9.com/_posts/pythoninstall

Step 1: Prepare FFmpeg

This project uses ffmpeg for audio and video format preprocessing.

  • Windows (Recommended):

    1. Download a build from the FFmpeg GitHub repository. After extraction, you will get ffmpeg.exe.
    2. Place ffmpeg.exe directly in the root directory of the project (at the same level as app.py). The program will detect and use it automatically; no environment-variable configuration is needed.
  • macOS (using Homebrew):

    ```bash
    brew install ffmpeg
    ```
  • Linux (Debian/Ubuntu):

    ```bash
    sudo apt update && sudo apt install ffmpeg
    ```
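
To confirm that FFmpeg will actually be found, a quick check like the one below can help. This is an illustrative sketch, not code from the project: the helper name `find_ffmpeg` is made up, and it simply mirrors the lookup order described above (an ffmpeg.exe in the project root, otherwise the system PATH).

```python
import os
import shutil

def find_ffmpeg(project_root="."):
    """Locate ffmpeg: prefer a copy in the project root (the Windows setup),
    otherwise fall back to whatever is on the system PATH."""
    local = os.path.join(project_root, "ffmpeg.exe")
    if os.path.isfile(local):
        return local
    return shutil.which("ffmpeg")  # None if ffmpeg is not installed

path = find_ffmpeg()
print("ffmpeg found at:", path if path else "NOT FOUND - see Step 1")
```

If this prints "NOT FOUND", revisit the platform-specific instructions above before continuing.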

Step 2: Create a Python Virtual Environment and Install Dependencies

  1. Download or clone the project code to your local computer (preferably on a non-system drive, in a folder whose path contains only English letters or digits).

  2. Open a terminal and change to the root directory of the project (on Windows, you can type cmd in the folder's address bar and press Enter).

  3. Create a virtual environment: python -m venv venv

  4. Activate the virtual environment:

    • Windows (CMD/PowerShell): .\venv\Scripts\activate
    • macOS / Linux (Bash/Zsh): source venv/bin/activate
  5. Install Dependencies:

    • If you don't have an NVIDIA graphics card (CPU only):

      ```bash
      pip install -r requirements.txt
      ```
    • If you have an NVIDIA graphics card (GPU acceleration):

      a. Make sure you have installed the latest NVIDIA driver and the matching CUDA Toolkit.
      b. Uninstall any existing PyTorch version: pip uninstall -y torch
      c. Install the PyTorch build that matches your CUDA version (using CUDA 12.6 as an example):

      ```bash
      pip install torch --index-url https://download.pytorch.org/whl/cu126
      ```
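
After installing dependencies, either path (CPU or GPU) can be sanity-checked from Python. The helper below is an illustrative sketch (the name `check_gpu_setup` is not part of the project); it reports whether PyTorch is installed and whether it can see a CUDA device.

```python
import importlib.util

def check_gpu_setup():
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch not installed - run the pip install step above"
    import torch
    if torch.cuda.is_available():
        return f"CUDA ready: {torch.cuda.get_device_name(0)}"
    return "PyTorch installed, running on CPU"

print(check_gpu_setup())
```

If you expected GPU acceleration but see "running on CPU", recheck your NVIDIA driver, CUDA Toolkit, and the CUDA-specific torch install from step c.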

Step 3: Start the Service

In the terminal with the virtual environment activated, run the following command:

```bash
python app.py
```

You will see a message indicating that the service has started. On first run, the model (approximately 1.2 GB) will be downloaded; please be patient.

Various log messages may appear during startup; they can safely be ignored.

(Screenshot: successful startup interface)
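
Before moving on, you can verify from another terminal that the service is actually listening on its port. Below is a minimal stdlib-only probe; the function name `service_is_up` is illustrative, not part of the project.

```python
import urllib.request
import urllib.error

def service_is_up(base_url="http://127.0.0.1:5092"):
    """Return True if the parakeet-api web server answers on its port."""
    try:
        with urllib.request.urlopen(base_url, timeout=3) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

print("service running:", service_is_up())
```

A False result usually means the server has not finished starting (the first-run model download can take a while) or is running on a different port.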

🚀 How to Use

Method 1: Using the Web Interface

  1. Open in your browser: http://127.0.0.1:5092
  2. Drag and drop or click to upload your audio or video file.
  3. Click "Start Transcription" and wait for processing to finish. The SRT subtitles will then appear below, ready to view and download.

Method 2: API Call (Python Example)

Use the openai library to easily call this service.

```python
from openai import OpenAI

# Point the client at the local service; any api_key value is accepted
client = OpenAI(
    base_url="http://127.0.0.1:5092/v1",
    api_key="any-key",
)

with open("your_audio.mp3", "rb") as audio_file:
    srt_result = client.audio.transcriptions.create(
        model="parakeet",
        file=audio_file,
        response_format="srt",
    )
print(srt_result)
```
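
For batch work, it is handy to write the subtitles to disk next to the source file. The wrapper below is a sketch built on the same API call; the helper names `srt_path_for` and `transcribe_to_srt` are illustrative, not part of parakeet-api.

```python
from pathlib import Path

def srt_path_for(media_path):
    """Output path: same name as the input, with an .srt extension."""
    return Path(media_path).with_suffix(".srt")

def transcribe_to_srt(media_path, base_url="http://127.0.0.1:5092/v1"):
    """Send a media file to the local service and save the SRT next to it."""
    from openai import OpenAI  # imported here so the openai dependency stays optional

    client = OpenAI(base_url=base_url, api_key="any-key")
    with open(media_path, "rb") as f:
        srt = client.audio.transcriptions.create(
            model="parakeet", file=f, response_format="srt"
        )
    out = srt_path_for(media_path)
    out.write_text(str(srt), encoding="utf-8")
    return out
```

Calling `transcribe_to_srt("talk.mp4")` would then leave a `talk.srt` alongside the video, ready for a player or editor to pick up.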

Method 3: Integration with pyVideoTrans

Parakeet-API integrates seamlessly with the video translation tool pyVideoTrans (version 3.72 and above).

  1. Make sure your parakeet-api service is running locally.
  2. Open the pyVideoTrans software.
  3. In the menu bar, select Speech Recognition(R) -> Nvidia parakeet-tdt.
  4. In the pop-up configuration window, set the "http address" to: http://127.0.0.1:5092/v1
  5. Click "Save" to start using it.