Want to try large models but held back by your local hardware? Local deployment with tools like Ollama is usually limited by available resources to smaller models such as 1.5b (1.5 billion parameters), 7b (7 billion), or 14b (14 billion). Running a 70-billion-parameter model is a significant challenge for local hardware.

Now you can run a large model such as a 70b one online with Cloudflare's Workers AI and access it over the internet. With a small Worker in front of it, the interface is OpenAI-compatible, so you can call it the same way you would call OpenAI's API. The only drawback is that the daily free quota is limited, and usage beyond it is billed. If you're interested, give it a try!

Preparation: Log in to Cloudflare and Bind a Domain

If you don't have your own domain, Cloudflare gives each account a free default domain for Workers (a workers.dev subdomain). Note, however, that this default domain may not be directly reachable in some regions, in which case you will need a workaround to access it.

First, go to the Cloudflare website (https://dash.cloudflare.com) and log in to your account.

Step 1: Create Workers AI

  1. Find Workers AI: In the Cloudflare dashboard's left navigation bar, locate "AI" -> "Workers AI," then click "Create from Worker Template."

  2. Create Worker: Next, click "Create Worker."

  3. Enter Worker Name: Enter a name made of English letters. It becomes part of the Worker's default domain under your account.

  4. Deploy: Click the "Deploy" button in the bottom right to complete the Worker creation.

Step 2: Modify Code to Deploy Llama 3.3 70b Large Model

  1. Enter Code Editor: After the deployment finishes, click "Edit Code" on the page that appears.

  2. Clear Code: Delete all the preset code in the editor.

  3. Paste Code: Copy the code below and paste it into the code editor.

    It uses the llama-3.3-70b-instruct-fp8-fast model, which has 70 billion parameters.

    You can swap in other models listed on the Cloudflare Models Page, such as the open-source DeepSeek models. At the time of writing, llama-3.3-70b-instruct-fp8-fast is one of the largest and most capable models offered there.

    javascript
    // Shared secret that clients must send; change it before deploying.
    const API_KEY = '123456';
    const MODEL = '@cf/meta/llama-3.3-70b-instruct-fp8-fast';

    // Build a JSON error response in the OpenAI error format.
    function jsonError(status, message, code) {
      return Response.json(
        { error: { message, type: "invalid_request_error", param: null, code } },
        { status }
      );
    }

    export default {
      async fetch(request, env) {
        const { pathname } = new URL(request.url);

        // Accept the key from "Authorization: Bearer <key>" or "x-api-key: <key>".
        const authHeader = request.headers.get("authorization");
        const apiKey = authHeader?.startsWith("Bearer ")
          ? authHeader.slice(7)
          : request.headers.get("x-api-key");

        if (API_KEY && apiKey !== API_KEY) {
          return jsonError(401, "Invalid API key. Use 'Authorization: Bearer your-api-key' header", "invalid_api_key");
        }

        if (pathname === "/v1/chat/completions" && request.method === "POST") {
          // messages - chat style input
          const { messages } = await request.json();
          const response = await env.AI.run(MODEL, { messages });

          // Wrap the model output in an OpenAI-style chat completion.
          return Response.json({
            choices: [{
              index: 0,
              message: { role: "assistant", content: response.response },
              finish_reason: "stop"
            }]
          });
        }

        // Any other path or method is not supported.
        return jsonError(404, "Not found. Use POST /v1/chat/completions", "not_found");
      }
    };
  4. Deploy Code: After pasting the code, click the "Deploy" button.
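
The two pieces of the Worker that are easiest to get wrong are reading the API key and shaping the reply so that OpenAI-style clients can parse it. Below is a minimal sketch of both as standalone functions that you can try locally before deploying. The names extractApiKey and toChatCompletion are hypothetical helpers invented for this example, not part of any Cloudflare or OpenAI API, and the response object only includes the fields most clients read.

```javascript
// Hypothetical helpers for illustration; not part of any Cloudflare API.

// Returns the API key from an "Authorization: Bearer <key>" header or,
// failing that, from an "x-api-key" header. Returns null if neither is set.
// `headers` is any object with a get(name) method, such as Fetch API Headers.
function extractApiKey(headers) {
  const auth = headers.get("authorization");
  if (auth && auth.startsWith("Bearer ")) {
    return auth.slice("Bearer ".length).trim();
  }
  const direct = headers.get("x-api-key");
  return direct ? direct.trim() : null;
}

// Wraps plain model output in a minimal OpenAI-style chat completion object.
function toChatCompletion(text, model) {
  return {
    object: "chat.completion",
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content: text },
        finish_reason: "stop",
      },
    ],
  };
}
```

Keeping these as plain functions means you can check them with ordinary objects, without a deployed Worker.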

Step 3: Bind a Custom Domain

  1. Return to Settings: Click the back button on the left to return to the Worker management page, then go to "Settings" -> "Domains and Routes."

  2. Add Custom Domain: Click "Add Domain," select "Custom Domain," and enter a subdomain of a domain you have already added to Cloudflare.

Step 4: Use in OpenAI-Compatible Tools

After adding a custom domain, you can use this large model in any tool that supports the OpenAI API.

  • API Key: This is the API_KEY set in your code, defaulting to 123456.
  • API Endpoint: https://your-custom-domain/v1
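
If you want to call the endpoint from your own code rather than from an existing tool, the request can be sketched as follows. This is a minimal example under stated assumptions: buildChatRequest and chat are hypothetical helper names, https://your-custom-domain/v1 stands in for your real domain, and the runtime is assumed to provide a global fetch (for example a browser or a recent Node.js).

```javascript
// Hypothetical client helpers for illustration only.

// Builds the URL and fetch options for a chat request to the Worker.
// baseUrl is the "API Endpoint" value, e.g. "https://your-custom-domain/v1".
function buildChatRequest(baseUrl, apiKey, messages) {
  // Drop trailing slashes so the path is joined with exactly one "/".
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        // OpenAI-style clients send a model name; the Worker from Step 2
        // runs the model hard-coded in its own source.
        model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
        messages,
      }),
    },
  };
}

// Sends the request and returns the assistant's reply text.
async function chat(baseUrl, apiKey, messages) {
  const { url, options } = buildChatRequest(baseUrl, apiKey, messages);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Assuming the Worker is deployed as described, awaiting chat("https://your-custom-domain/v1", "123456", [{ role: "user", content: "Hello" }]) should resolve to the model's reply text.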

Because inference runs on Cloudflare's GPUs rather than on your own machine, responses should be smooth even with a 70b model.

Important Notes

image.png