
Speaker Recognition/Diarization

Speaker recognition is supported starting from v3.85.

Note: Due to limitations in current model performance, speaker recognition may not be entirely accurate.

To keep the software package small, the speaker recognition model is not bundled. If you need this feature, download the model manually from one of the links below, extract the archive, and copy the .onnx and .txt files into the models/onnx folder inside the software directory.

GitHub Download Link

https://github.com/jianchang512/stt/releases/download/0.0/noise-uvr-speaker-realtime.7z

Baidu Netdisk Download Link

https://pan.baidu.com/s/1UaI0BCXeRwditx-pIy_e9A?pwd=1234
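If you prefer to script the copy step rather than do it by hand, the following is a minimal sketch. It assumes the .7z archive has already been extracted with a tool such as 7-Zip. The function name install_speaker_model and both path arguments are hypothetical and not part of the software. It simply copies every .onnx and .txt file found under the extracted folder into models/onnx.

```python
from pathlib import Path
import shutil


def install_speaker_model(extracted_dir, software_dir):
    """Copy all .onnx and .txt files under extracted_dir into
    software_dir/models/onnx and return the copied file names.

    Both arguments are hypothetical paths supplied by the caller:
    the folder the archive was extracted to, and the software directory.
    """
    src = Path(extracted_dir)
    dest = Path(software_dir) / "models" / "onnx"
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    # Collect the file list first so files copied into dest are not revisited.
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() not in {".onnx", ".txt"}:
            continue
        # Skip files that already live in the destination folder.
        if path.resolve().parent == dest.resolve():
            continue
        shutil.copy2(path, dest / path.name)
        copied.append(path.name)
    return copied
```

For example, install_speaker_model("Downloads/noise-uvr-speaker-realtime", "path/to/software") would place the model files in path/to/software/models/onnx; both paths here are placeholders to be replaced with your own.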

Then check the Speaker Recognition checkbox in the software interface. The number next to the checkbox sets how many speakers to identify; the default is unlimited. If you know the number of speakers, select a specific value (2-10) to improve recognition accuracy.