
In speech-to-text tasks, background noise or musical accompaniment can reduce recognition accuracy. Removing the background music beforehand usually gives more accurate results. Two tools for doing this are covered here:

1. vocal-separate: A local, offline vocal isolation tool based on spleeter. It offers a pre-packaged version for Windows, ready to use after unzipping. Mac/Linux users need to deploy from the source code. It has a Chinese interface, is very easy to use, supports direct video processing, and is relatively fast.

2. Ultimate Vocal Remover: The desktop GUI version of UVR5. On Windows, install it on the C drive; installing elsewhere may cause problems. It has an English interface with many options, so it is more complex to operate, but it is also more powerful and produces better results.


vocal-separate Installation and Usage

1. On Windows, download the pre-packaged version from the releases page: https://github.com/jianchang512/vocal-separate/releases. On other systems, pull the source code and deploy it yourself.

2. After downloading, unzip the archive and double-click start.exe. If you see an error like the one shown below, don't worry: it only means that GPU acceleration is unavailable, and it does not affect usage.

After successful startup, the following browser page will open:

3. As shown above, drag and drop (or click to upload) the audio or video file whose vocals you want to isolate. Videos are automatically converted to audio before processing.

Select "2stems" from the models to separate the uploaded file into two files: vocals and other sounds.

Of course, you can also choose the 4stems and 5stems models, which, in addition to separating the vocals, will further divide the other sounds into files such as "drums" and "bass." In general, only 2stems is needed.

You can listen to the separation results on the webpage, click download, or directly go to the displayed separation results directory to find the separated files. The vocal file name is vocals.wav, and the other sound file name is accompaniment.wav.
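If you want to script a follow-up step such as transcription, you can check for these files programmatically. The sketch below is only illustrative. The 2stems file names come from the description above. The 4stems and 5stems names follow Spleeter's standard stem names and are assumed to carry over to this tool. The layout, with all files directly inside one results folder, is also an assumption.

```python
from pathlib import Path

# Stem names per model; only the 2stems names are confirmed by the text above.
STEM_FILES = {
    "2stems": ["vocals", "accompaniment"],
    "4stems": ["vocals", "drums", "bass", "other"],
    "5stems": ["vocals", "drums", "bass", "piano", "other"],
}


def expected_outputs(results_dir: str, model: str = "2stems") -> dict[str, Path]:
    """Map each stem name to the WAV path expected for the given model."""
    if model not in STEM_FILES:
        raise ValueError(f"unknown model: {model!r}")
    base = Path(results_dir)
    return {stem: base / f"{stem}.wav" for stem in STEM_FILES[model]}


def missing_outputs(results_dir: str, model: str = "2stems") -> list[str]:
    """Return the stem names whose WAV file is not present in results_dir."""
    outputs = expected_outputs(results_dir, model)
    return [stem for stem, path in outputs.items() if not path.is_file()]
```

An empty list from missing_outputs means every expected file is in place, and expected_outputs(results_dir)["vocals"] gives the path to pass to your speech-to-text step.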

It's that simple.


Ultimate Vocal Remover Installation and Usage

1. First, download the program from the v5.6 release page: https://github.com/Anjok07/ultimatevocalremovergui/releases/tag/v5.6

Windows users can also download the installer directly: https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/UVR_v5.6.0_setup.exe. After downloading, double-click the exe file and click Next through the prompts to complete the installation.

2. After the installation is complete, double-click the desktop icon to launch the program.

3. As shown below, select the audio file you want to process, set the output directory, and choose the processing model, bit rate, and other options. Except for "Select Input" and "Select Output," all other options are non-essential and can be left at their defaults.

"Select Input": Click to select the audio file you want to process.

"Select Output": Click to choose where to save the processed file.

"CHOOSE PROCESS METHOD": Select the processing method. The default is MDX-Net, which should provide the best results, so keep the default.

"CHOOSE MDX-NET MODEL": The model used with the method chosen above. If you choose a method other than MDX-Net, you need to download its model separately.

"Start Processing": Once everything is selected, click this button to start the separation, then wait for the completion prompt.