Want to try large language models but held back by limited local hardware? Models are usually deployed locally with tools like ollama, but local resources often restrict you to smaller models such as 1.5b (1.5 billion parameters), 7b (7 billion), and 14b (14 billion). Running a 70-billion-parameter model is a major challenge for local hardware.

Now you can deploy large models like 70b online with Cloudflare's Workers AI and access them over the public internet. The interface is OpenAI-compatible, so you can call it the same way you call the OpenAI API. The one downside is that the daily free tier is limited, and usage beyond it is billed. If you're interested, give it a try!

Preparation: Log in to Cloudflare and Bind a Domain Name

If you don't have your own domain name yet, Cloudflare provides a free account domain for your Workers. Note, however, that this free domain may not be directly accessible in some regions.

First, open the Cloudflare website (https://dash.cloudflare.com) and log in to your account.

Step 1: Create Workers AI

  1. Find Workers AI: In the left navigation bar of the Cloudflare console, find "AI" -> "Workers AI", and then click "Create from Worker template".

  2. Create Worker: Then click "Create Worker".

  3. Fill in the Worker Name: Enter a name made up of English letters. It becomes part of the Worker's default domain under your account.

  4. Deploy: Click the "Deploy" button in the lower right corner to complete the creation of the Worker.

Step 2: Modify the Code and Deploy the Llama 3.3 70b Large Model

  1. Enter Code Editing: After deployment completes, click "Edit code".

  2. Clear the Code: Delete all the preset code in the editor.

  3. Paste the Code: Copy and paste the following code into the code editor:

    Here we are using the llama-3.3-70b-instruct-fp8-fast model, which has 70 billion parameters.

    You can swap in another model from the Cloudflare Models page, such as the open-source DeepSeek models. At the time of writing, llama-3.3-70b-instruct-fp8-fast is one of the largest and most capable models available there.

    javascript
    // Shared secret that clients must send. Change this before deploying.
    const API_KEY = '123456';

    export default {
      async fetch(request, env) {
        const url = new URL(request.url);
        const path = url.pathname;

        // Accept "Authorization: Bearer <key>" or a raw "x-api-key: <key>" header.
        const authHeader = request.headers.get("authorization");
        const apiKey = authHeader?.startsWith("Bearer ")
          ? authHeader.slice(7)
          : request.headers.get("x-api-key");

        if (API_KEY && apiKey !== API_KEY) {
          return new Response(JSON.stringify({
            error: {
              message: "Invalid API key. Use 'Authorization: Bearer your-api-key' header",
              type: "invalid_request_error",
              param: null,
              code: "invalid_api_key"
            }
          }), {
            status: 401,
            headers: { "Content-Type": "application/json" }
          });
        }

        if (path === "/v1/chat/completions") {
          const requestBody = await request.json();
          // OpenAI-style chat input: an array of { role, content } messages.
          const { messages } = requestBody;
          const response = await env.AI.run('@cf/meta/llama-3.3-70b-instruct-fp8-fast', { messages });

          // Wrap the model's text in a minimal OpenAI-style response.
          const resdata = {
            choices: [{ message: { role: "assistant", content: response.response } }]
          };
          return Response.json(resdata);
        }

        // Any other path is not supported.
        return new Response("Not found", { status: 404 });
      }
    };
  4. Deploy the Code: After pasting the code, click the "Deploy" button.
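
The Worker code in this step returns only the `choices[0].message.content` field. Some OpenAI-compatible tools may also expect fields such as `id`, `object`, `created`, `model`, and `finish_reason`. Below is a minimal sketch of a fuller response, assuming the Workers AI result has the shape `{ response: "..." }` as used in the code. The helper name `toChatCompletion`, the generated `id` format, and the fixed `finish_reason` of `"stop"` are illustrative assumptions, not something Workers AI reports.

```javascript
// Hypothetical helper: wraps Workers AI text output in an OpenAI-style
// chat completion object. Assumes aiResult looks like { response: "..." }.
function toChatCompletion(aiResult, model, now = Date.now()) {
  return {
    id: `chatcmpl-${now}`,            // illustrative id, not a real OpenAI id
    object: "chat.completion",
    created: Math.floor(now / 1000),  // Unix time in seconds
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content: aiResult.response ?? "" },
        finish_reason: "stop",        // assumed; not reported by the model call
      },
    ],
  };
}

// Example with a stubbed model result and a fixed timestamp.
const completion = toChatCompletion(
  { response: "Hello!" },
  "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  1700000000000
);
console.log(JSON.stringify(completion, null, 2));
```

Inside the Worker, `Response.json(toChatCompletion(response, modelName))` could take the place of the hand-built `resdata` object, where `modelName` is whichever model string you pass to `env.AI.run`.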

Step 3: Bind a Custom Domain Name

  1. Return to Settings: Click the back button on the left to return to the Worker management page, then open "Settings" -> "Domains & Routes".

  2. Add a Custom Domain: Click "Add a Domain", then select "Custom Domain" and enter a subdomain of a domain you have already added to Cloudflare.

Step 4: Use in OpenAI-Compatible Tools

After adding a custom domain, you can use this large model in any OpenAI API-compatible tool.

  • API Key: The API_KEY you set in the code, which is 123456 by default.
  • API Address: https://your-custom-domain/v1
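
For a quick check outside a GUI tool, the endpoint can also be called directly. Below is a minimal sketch, assuming Node.js 18+ (which has a built-in `fetch`) and the placeholder domain `your-custom-domain`. The helper names `buildChatRequest` and `chat` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: builds the URL and fetch options for a chat request.
function buildChatRequest(baseUrl, apiKey, messages) {
  return {
    // Strip any trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ messages }),
    },
  };
}

// Sends the request and returns the assistant's reply text.
async function chat(baseUrl, apiKey, messages) {
  const { url, init } = buildChatRequest(baseUrl, apiKey, messages);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

// Placeholder values: replace with your own domain and key.
const request = buildChatRequest("https://your-custom-domain/v1", "123456", [
  { role: "user", content: "Hello!" },
]);
console.log(request.url); // https://your-custom-domain/v1/chat/completions
```

Calling `chat("https://your-custom-domain/v1", "123456", [{ role: "user", content: "Hello!" }])` against a deployed Worker would then resolve to the model's reply as a string.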

Because inference runs on Cloudflare's GPUs rather than on your own machine, the 70b model is very smooth to use.

Precautions

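  • The daily free tier is limited, and usage beyond it is billed.
  • The free account domain may not be directly accessible in some regions.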