Want to try large language models but held back by your computer's hardware? We usually deploy models locally with tools like ollama, but limited resources often mean we can only run smaller models such as 1.5b (1.5 billion parameters), 7b (7 billion), or 14b (14 billion). Running a model with 70 billion parameters is a serious challenge for local hardware.
Now you can use Cloudflare's Workers AI to run large models such as a 70b model online and reach them over the public internet. The Worker built in this guide exposes an OpenAI-style interface, so you can call it much like OpenAI's API. The only drawback is that the daily free quota is limited, and usage beyond it is charged. If that sounds interesting, give it a try!
Preparation: Log in to Cloudflare and Bind a Domain
If you don't have your own domain yet, Cloudflare gives each account a free workers.dev subdomain. Note, however, that this free domain may not be directly reachable from mainland China, so you may need some "magic" (a proxy or VPN) to access it.
First, open the Cloudflare official website (https://dash.cloudflare.com) and log in to your account.
Step 1: Create Workers AI
- Find Workers AI: In the left navigation bar of the Cloudflare console, find "AI" -> "Workers AI", and then click "Create from Worker template".
- Create Worker: Then click "Create Worker".
- Fill in Worker Name: Enter a name made up of English letters. It becomes part of your Worker's default domain.
- Deploy: Click the "Deploy" button in the lower right corner to complete the creation of the Worker.
Step 2: Modify the Code and Deploy the Llama 3.3 70b Large Model
- Enter Code Editing: After deployment completes, click "Edit code".
- Clear Code: Delete all preset code in the editor.
- Paste Code: Copy and paste the following code into the code editor:
Here we are using the `llama-3.3-70b-instruct-fp8-fast` model, which has 70 billion parameters. You can also find other models to swap in on the Cloudflare model page, such as the open-source Deepseek models. At the time of writing, though, `llama-3.3-70b-instruct-fp8-fast` is one of the largest and most capable models available.

```javascript
// Key that clients must send; set to an empty string to disable the check.
const API_KEY = '123456';

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const path = url.pathname;

    // Accept "Authorization: Bearer <key>" or a bare key in "x-api-key".
    const authHeader = request.headers.get("authorization");
    const apiKey = authHeader?.startsWith("Bearer ")
      ? authHeader.slice(7)
      : request.headers.get("x-api-key");

    if (API_KEY && apiKey !== API_KEY) {
      return new Response(JSON.stringify({
        error: {
          message: "Invalid API key. Use 'Authorization: Bearer your-api-key' header",
          type: "invalid_request_error",
          param: null,
          code: "invalid_api_key"
        }
      }), {
        status: 401,
        headers: { "Content-Type": "application/json" }
      });
    }

    if (path === "/v1/chat/completions") {
      const requestBody = await request.json();
      // messages - chat style input
      const { messages } = requestBody;
      const response = await env.AI.run('@cf/meta/llama-3.3-70b-instruct-fp8-fast', { messages });
      // Wrap the model's text in an OpenAI-style response body.
      const resdata = {
        choices: [{ message: { content: response.response } }]
      };
      return Response.json(resdata);
    }

    // Any other path is not handled by this Worker.
    return new Response("Not found", { status: 404 });
  }
};
```
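The Worker above returns a minimal body containing only `choices[0].message.content`. Some OpenAI-compatible tools may also expect fields such as `id`, `object`, `created`, `model`, `role`, and `finish_reason`. The sketch below shows one way to build a fuller response. The helper name `toChatCompletion` and the generated `id` format are illustrative assumptions, not part of Cloudflare's or OpenAI's APIs.

```javascript
// Hypothetical helper: wraps the text returned by env.AI.run() in an
// object shaped like an OpenAI chat completion response.
function toChatCompletion(text, model) {
  return {
    // Illustrative id only; any unique string would do here.
    id: "chatcmpl-" + Date.now().toString(36),
    object: "chat.completion",
    // OpenAI-style timestamps are in seconds, not milliseconds.
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content: text },
        finish_reason: "stop",
      },
    ],
  };
}
```

Inside the Worker, you could then return `Response.json(toChatCompletion(response.response, '@cf/meta/llama-3.3-70b-instruct-fp8-fast'))` in place of the hand-built `resdata` object.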
- Deploy Code: After pasting the code, click the "Deploy" button.
Step 3: Bind a Custom Domain
- Return to Settings: Click the back button on the left to return to the Worker's management page, then find "Settings" -> "Domains & Routes".
- Add Custom Domain: Click "Add Domain", select "Custom Domain", and enter a subdomain of a domain you have already added to Cloudflare.
Step 4: Use in Tools Compatible with OpenAI
After adding a custom domain, you can use this large model in any tool compatible with the OpenAI API.
- API Key: The `API_KEY` you set in the code, which defaults to `123456`.
- API Address: `https://your-custom-domain/v1`
Because inference runs on Cloudflare's GPUs rather than on your own machine, responses come back quickly.
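If you would rather call the Worker from your own script than from an existing tool, a request might look like the sketch below. It assumes the Worker from Step 2 is deployed, that the runtime provides a global `fetch` (for example a recent Node.js version or a browser), and that `https://your-custom-domain/v1` is replaced with your own address. The function names `buildChatRequest` and `ask` are hypothetical.

```javascript
// Builds the URL and fetch options for a chat request to the Worker.
// baseUrl is the "API Address" from this step, e.g. "https://your-custom-domain/v1".
function buildChatRequest(baseUrl, apiKey, messages) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + apiKey,
      },
      body: JSON.stringify({ messages }),
    },
  };
}

// Sends one user message and returns the model's reply text.
async function ask(baseUrl, apiKey, question) {
  const { url, init } = buildChatRequest(baseUrl, apiKey, [
    { role: "user", content: question },
  ]);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error("Request failed with status " + res.status);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

A call such as `ask("https://your-custom-domain/v1", "123456", "Hello!").then(console.log)` should print the model's reply, provided the key matches the `API_KEY` set in the Worker.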
Precautions
- Free Quota: Cloudflare Workers AI includes a free daily allocation of 10,000 Neurons (Cloudflare's unit for measuring AI usage, which is not the same as tokens). Usage beyond this amount is billed.
- Fee Details: You can view detailed fee information on the Cloudflare official pricing page (https://developers.cloudflare.com/workers-ai/platform/pricing/).