CosyVoice2-TTS Windows One-Click All-in-One Package: AI Voice Synthesis Made Easy, Even for Beginners
- Download Link 1: Download from Baidu Netdisk
- Download Link 2: Download from HuggingFace.co
Are you amazed by Alibaba's open-source CosyVoice2 AI voice synthesis technology, but deterred by the complex and often error-prone installation process?
Don't worry, this one-click all-in-one package is tailored for you!
With it, you don't need to install Python or struggle with various complex errors. With just a few simple steps on Windows 10 or Windows 11, you can easily experience cutting-edge AI voice synthesis technology.
Quickly Understand the Power of CosyVoice2
CosyVoice2 is a very powerful multilingual voice synthesis model that can generate extremely accurate, stable, and natural-sounding speech.
- Supports multiple languages: Including Chinese, English, Japanese, Korean, and even Cantonese, Sichuanese, Shanghainese, and other Chinese dialects.
- Cross-lingual voice cloning: You can use a Chinese voice to speak fluent English, and vice versa.
- Ultra-low latency: Response is extremely fast; you can hear the generated sound in as little as 150 milliseconds.
- More accurate pronunciation: Compared to the previous generation, the pronunciation error rate is reduced by 30%-50%.
- Super stable timbre: No matter how you use it, you can maintain the consistency and stability of the sound.
- Emotion and accent control: Supports finer emotion control and accent adjustment, making the sound more expressive.
🚀 Just Three Steps to Start Your AI Voice Journey
Step 1: Download the All-in-One Package
First, download the all-in-one package file named cosyvoice2-win.7z. We provide two download channels; choose whichever is faster for you:
- Download Link 1: Download from Baidu Netdisk
- Download Link 2: Download from HuggingFace.co
Special Reminder:
This is a .7z compressed archive. If your computer cannot open it directly, or an error message appears during extraction, install a free extraction tool such as 360 Compression or Bandizip and try again.
Step 2: Extract the Files
After the download is complete, find this compressed package. Right-click it and select "Extract to current folder" or "Extract to cosyvoice2". After decompression, you will get a new folder with the same name.
Step 3: Double-Click to Start!
Open the folder you just extracted and find a file named 双击启动.bat (Double-click to Start.bat). Double-click it, and the program will start running!
What happens after double-clicking?
A black window will now pop up (we call it the "command prompt"). Please do not close this window; the program is handling everything for you in the background:
- Automatically download model files: The program first checks whether the AI model files it needs (several GB) are complete. If any files are missing, it starts downloading them automatically, and you will see the download progress in the window. This can take a long time depending on your network speed, so please be patient.
  Network tip: If the download fails halfway and you want to download again, first open the pretrained_models folder, delete the incomplete model folder inside it, and then run 双击启动.bat (Double-click to Start.bat) again.
- Start the core service: Once the model is ready, the program automatically starts the WebUI service. This is the interface you use for voice synthesis.
- See the success prompt: Keep waiting until you see information similar to the following in the black window:
  Running on local URL: http://127.0.0.1:8000
  To create a public link, set `share=True` in `launch()`.
This means that CosyVoice2 is running successfully on your computer!
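For readers who prefer a script over deleting the folder by hand, the cleanup described in the network tip can be sketched in Python as follows. This is only a minimal sketch: the function name remove_model_folder is hypothetical and not part of the package, and it assumes the model folders sit directly inside pretrained_models as described above. You still have to tell it which model folder is the incomplete one.

```python
import shutil
from pathlib import Path


def remove_model_folder(package_root, model_name):
    """Delete pretrained_models/<model_name> so the next start downloads it again.

    Returns True if a folder was removed, False if it did not exist.
    (Hypothetical helper for illustration; not shipped with the package.)
    """
    models_dir = Path(package_root) / "pretrained_models"
    target = (models_dir / model_name).resolve()
    # Safety check: only accept a folder that is a direct child of pretrained_models.
    if target.parent != models_dir.resolve():
        raise ValueError(f"{model_name!r} is not a folder directly inside {models_dir}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

For example, calling remove_model_folder with the folder you extracted the package to and the name "CosyVoice2-0.5B" would remove only that model's folder. After that, run 双击启动.bat again so the program re-downloads the model.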
💻 Start Your AI Voice Creation
Please keep the black window open, then open your browser (Chrome or Edge is recommended), and enter in the top address bar:
http://127.0.0.1:8000
Press Enter, and you will see a simple and powerful operation interface. Now, you can explore, enter text, upload sound samples, and generate unique AI voices!
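If you would rather check from a script whether the WebUI is reachable, the following is a minimal Python sketch. It assumes the default address http://127.0.0.1:8000 shown above. The function name wait_for_webui and the timeout values are illustrative choices, not part of the package. The request deliberately ignores proxy settings so that it goes straight to the local machine.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_webui(url="http://127.0.0.1:8000", timeout=60.0, interval=2.0):
    """Poll url until it answers or timeout seconds pass; return True on success.

    (Hypothetical helper for illustration; the defaults are assumptions.)
    """
    # An empty ProxyHandler makes urllib skip system and environment proxies.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is running.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timed out, or a broken reply: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A return value of True means something is answering at that address. It does not prove that the model finished loading, so the success prompt in the black window remains the authoritative signal.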
How to close the program?
It's very simple: when you're done using it, just close the black window that has been open.
🔧 Advanced Gameplay: Switch Between Different Sound Models
This all-in-one package has multiple models built in, each with different characteristics. By default it starts the most comprehensive model, CosyVoice2-0.5B. If you have special needs, you can switch manually.
- CosyVoice-300M-SFT: You must use this when you want to use the built-in preset voices.
- CosyVoice-300M-Instruct: You must use this when you want to control the sound through text descriptions (such as "speak in a gentle tone").
- CosyVoice2-0.5B: The latest and strongest model, with the best overall effect (default option).
- CosyVoice-300M: A basic model.
Switching method:
Find the 双击启动.bat (Double-click to Start.bat) file in the folder, right-click it, and select "Edit". (If you don't see "Edit", select "Open with" -> "Notepad".) You will see the following lines of code:
call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice2-0.5B
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-Instruct
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-SFT
Here rem means "comment": a line that starts with it is temporarily inactive.
- To disable the current model: Add rem (followed by a space) at the beginning of that line of code.
- To enable the target model: Delete rem at the beginning of the target model's line of code.
After making the changes, save the file and close Notepad. Then close the black window if the program is still running, and double-click 双击启动.bat (Double-click to Start.bat) again.
For example, to switch to the CosyVoice-300M-SFT model, modify it like this:
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice2-0.5B
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-Instruct
call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-SFT
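The same edit can be expressed as a small text transformation. The Python sketch below works on the contents of the .bat file as a string: it adds rem to every launch line except the one for the chosen model, and removes rem from that one. The function name switch_model is hypothetical, and the pattern assumes the launch lines look exactly like the ones shown above. Reading and writing the file is left out on purpose, because the file's text encoding is not stated in this guide.

```python
import re

# Matches an optional "rem " prefix, then the launch command and its model folder.
# Assumes the line layout shown in this guide.
LAUNCH_LINE = re.compile(
    r"^\s*(?:rem\s+)?(call\s+.*--model_dir\s+pretrained_models/(\S+))\s*$",
    re.IGNORECASE,
)


def switch_model(bat_text, model_name):
    """Return bat_text with only the launch line for model_name left active.

    (Hypothetical helper for illustration; not shipped with the package.)
    """
    out = []
    found = False
    for line in bat_text.splitlines():
        match = LAUNCH_LINE.match(line)
        if match is None:
            out.append(line)  # Not a launch line: keep it unchanged.
        elif match.group(2) == model_name:
            out.append(match.group(1))  # Target model: drop any "rem " prefix.
            found = True
        else:
            out.append("rem " + match.group(1))  # Other models: comment out.
    if not found:
        raise ValueError(f"no launch line found for {model_name!r}")
    return "\n".join(out) + "\n"
```

Passing the four default lines together with "CosyVoice-300M-SFT" produces the four lines shown in the example above. Keep a backup copy of the original .bat file before overwriting it.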
❓ Frequently Asked Questions
- What should I do if the program crashes after starting, or the black window reports the error ValueError: When localhost is not accessible...?
Solution: This usually happens because network proxy or VPN software (such as some accelerators) is running on your computer and interferes with the program's access to its own local address.
Please close your VPN or network proxy software and then double-click to start the program again.
- Double-clicking run-api.bat to run the API reports the error CosyVoice.__init__() got an unexpected keyword argument 'load_onnx'?
Solution: Open the api.py file in an editor or Notepad. Search for load_jit=True, load_onnx=False, load_trt=False and delete it, then search for load_jit=True, load_onnx=False and delete it. There are 2 places with this code. Delete the longer string first, because a search for the shorter one also matches the beginning of the longer one.
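The same fix can be sketched as a text replacement in Python. The two fragments are taken from the solution above. The function name strip_unsupported_kwargs is hypothetical, and the sketch assumes the fragments appear in api.py exactly as written, with the same spacing. It removes the longer fragment first so the shorter one cannot match part of it.

```python
# Fragments named in the fix above, longest first so that the shorter
# fragment never matches the start of the longer one.
UNSUPPORTED_KWARGS = (
    "load_jit=True, load_onnx=False, load_trt=False",
    "load_jit=True, load_onnx=False",
)


def strip_unsupported_kwargs(source):
    """Return (patched_source, number_of_fragments_removed).

    (Hypothetical helper for illustration; not shipped with the package.)
    """
    removed = 0
    for fragment in UNSUPPORTED_KWARGS:
        removed += source.count(fragment)
        source = source.replace(fragment, "")
    return source, removed
```

If the returned count is not 2, the file probably differs from what this guide expects, and it is safer to make the edit by hand. Removing a fragment can leave a trailing comma before the closing parenthesis of the call, which Python accepts.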
For Advanced Users: API Integration
The all-in-one package also includes a run-api.bat file. If you are a developer and want to integrate CosyVoice2's voice synthesis capabilities into other programs (such as pyVideoTrans), you can double-click this file to start the API service.