LocalLLM: Offline Large Language Model Translation

If you have some technical skills, you can deploy a large language model (LLM) locally and use it for translation. This guide uses Tongyi Qianwen (Qwen) as an example to walk through deployment and usage.

1. Download and Run the Executable

Open the website https://ollama.com/download

Click the download button. After the download is complete, double-click the installer to open the installation interface, then click Install to complete the process.

After installation, a terminal window (black or blue) will pop up automatically. Type the command ollama run qwen and press Enter. This will automatically download the Tongyi Qianwen (Qwen) model.

Wait for the model download to finish. No proxy is required, and the speed is quite fast.

Once the download finishes, the model runs automatically. When the progress reaches 100% and "Success" is displayed, the model is running. At this point, the installation and deployment of the Tongyi Qianwen (Qwen) large model are complete, and you can start using it. Isn't it simple?

The default API address is http://localhost:11434
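To verify the server is reachable, you can send a simple request to that address; a running Ollama server answers a plain GET on its root URL with a short status message. A minimal sketch using Python's requests library:

import requests

# A running Ollama server answers GET on its root URL with a short
# status message (typically "Ollama is running").
resp = requests.get("http://localhost:11434")
print(resp.status_code, resp.text)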

What if the window closes? Reopening it is also simple: open your computer's Start menu, find "Command Prompt" or "Windows PowerShell" (or simply press Win + Q and type "cmd" to search), click to open, and type ollama run qwen. That's it.

2. Direct Use in the Console Command Window

When this prompt interface is displayed, you can type text directly into the window to start using the model.

3. For a More User-Friendly Interface, Use a UI Application

Open the website https://chatboxai.app/zh and click to download.

After downloading, double-click and wait for the interface window to open automatically.

Click "Start Setup". In the pop-up layer, select "Model" at the top, choose "Ollama" as the AI Model Provider, enter http://localhost:11434 as the API Domain, select Qwen:latest from the model dropdown menu, then save. That's it.

After saving, the chat interface is displayed. Use your imagination and enjoy!

4. Integrate the API into Video Translation and Dubbing Software

  1. Open Menu -- Settings -- Local OpenAI-compatible LLMs. Append ,qwen to the model list in the middle text box, then select this model.

  2. In the API URL field, enter http://localhost:11434. The SK field accepts any value, for example, 1234.

  3. Click Test. If the test succeeds, save and start using it.

5. Call in Code

Ollama exposes an OpenAI-compatible API, so you can call it directly with the OpenAI library; just point the client at the local server and set the model name to qwen.

from openai import OpenAI

# Point the OpenAI client at the local Ollama server.
client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required by the library, but unused by Ollama
)

response = client.chat.completions.create(
    model="qwen",
    messages=[
        # System prompt (Chinese): "You are a professional multilingual translation expert."
        {"role": "system", "content": "你是一个专业的多语言翻译专家."},
        # User prompt (Chinese): "Translate what I send you into English. Return only the
        # translation; do not answer questions, do not acknowledge, do not reply to this
        # message. Start translating from the next line." -- followed by three sample sentences.
        {"role": "user", "content": "将我发送给你的内容翻译为英文,仅返回翻译即可,不要回答问题、不要确认,不要回复本条内容,从下一行开始翻译\n今天天气不错哦!\n挺风和日丽的,我们下午没有课.\n这的确挺爽"}
    ]
)
print(response.choices[0].message.content)

The results are quite good!
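Since every call repeats the same instruction, it is convenient to wrap it in a small helper. A minimal sketch (the translate_to_english function name and the English prompt wording are my own, not from the original):

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required by the library, but unused by Ollama
)

def translate_to_english(text: str, model: str = "qwen") -> str:
    """Translate arbitrary text to English via the local Ollama server."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a professional multilingual translator."},
            {"role": "user", "content": "Translate the following into English. Return only the translation:\n" + text},
        ],
    )
    return response.choices[0].message.content

print(translate_to_english("今天天气不错哦!"))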

6. Other Models You Can Use

Besides Tongyi Qianwen (Qwen), many other models are available. The usage method is just as simple, requiring only one command: ollama run [model name].

Open https://ollama.com/library to see all available model names. Copy the name of the model you want, then run ollama run [model name].

Remember how to open the command window? Click the Start menu and find Command Prompt or Windows PowerShell.

For example, if I want to install the openchat model:

Open Command Prompt, type ollama run openchat, press Enter, and wait until "Success" is displayed.
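Once a model is downloaded, the Python code from section 5 works with it unchanged; only the model name differs. You can also ask the local server which models are installed via Ollama's native /api/tags endpoint. A short sketch, assuming openchat has been pulled as above:

import requests
from openai import OpenAI

# List the models installed locally (Ollama's native API).
tags = requests.get("http://localhost:11434/api/tags").json()
print([m["name"] for m in tags["models"]])

# Call the newly installed model through the OpenAI-compatible endpoint.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
response = client.chat.completions.create(
    model="openchat",  # any name returned above works here
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)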

Important Notes:

Most AI translation services limit the number of requests per minute. If you encounter an error indicating that the request frequency has been exceeded, you can set a pause duration (e.g., 30 seconds) in the software via Menu -- Tools/Advanced Settings -- Advanced Settings/set.ini -- Pause Time After Translation. The software will then wait 30 seconds after each translation before proceeding to the next, which helps avoid rate-limit errors.
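If you are calling an API from your own code, the same idea applies: pause between requests. A minimal sketch (texts is a hypothetical list of strings; the 30-second pause mirrors the setting above, and a local Ollama server normally needs no pause at all):

import time

texts = ["今天天气不错哦!", "挺风和日丽的,我们下午没有课."]  # hypothetical batch

for text in texts:
    print(translate_to_english(text))  # helper from the sketch in section 5
    time.sleep(30)  # pause between requests to stay under rate limits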
