
Using AI Models

Agents can call AI models from any provider. Workers AI is built in and requires no API key. You can also use OpenAI ↗, Anthropic ↗, Google Gemini ↗, or any service that exposes an OpenAI-compatible API.

The AI SDK ↗ provides a unified interface across all of these providers, and is what AIChatAgent and the starter template use under the hood. You can also use the model routing features in AI Gateway to route across providers, evaluate responses, and manage rate limits.

Calling AI models

You can call models from any method within an Agent, including from HTTP requests in the onRequest handler, when a scheduled task runs, when handling a WebSocket message in the onMessage handler, or from any of your own methods.

Agents can call AI models autonomously, and can handle long-running responses that take minutes (or longer) to complete. If the client disconnects mid-stream, the Agent keeps running and can catch the client up when it reconnects.
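One way to think about the catch-up behavior is a cursor-based replay buffer. This is a minimal sketch independent of the Agents SDK; the `StreamBuffer` class and its method names are illustrative, not part of the API:

```ts
// Illustrative sketch: buffer streamed chunks so a client that reconnects
// can be replayed everything it missed, identified by a cursor it sends back.
class StreamBuffer {
  private chunks: string[] = [];

  // Record a chunk and return the cursor pointing just past it.
  push(chunk: string): number {
    this.chunks.push(chunk);
    return this.chunks.length;
  }

  // Replay every chunk at or after the given cursor.
  replayFrom(cursor: number): string[] {
    return this.chunks.slice(cursor);
  }
}
```

A reconnecting client would send back its last cursor, and the Agent would send everything from `replayFrom(cursor)` before resuming live chunks.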

Streaming over WebSockets

Modern reasoning models can take some time both to generate a response _and_ to stream it back to the client. Instead of buffering the entire response, you can stream it back over WebSockets.

src/index.js

```js
import { Agent } from "agents";
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";

export class MyAgent extends Agent {
  async onConnect(connection, ctx) {
    // Connections are accepted by default; add auth or setup logic here.
  }

  async onMessage(connection, message) {
    const msg = JSON.parse(message);
    await this.queryReasoningModel(connection, msg.prompt);
  }

  async queryReasoningModel(connection, userPrompt) {
    try {
      const workersai = createWorkersAI({ binding: this.env.AI });
      const result = streamText({
        model: workersai("@cf/zai-org/glm-4.7-flash"),
        prompt: userPrompt,
      });

      for await (const chunk of result.textStream) {
        if (chunk) {
          connection.send(JSON.stringify({ type: "chunk", content: chunk }));
        }
      }

      connection.send(JSON.stringify({ type: "done" }));
    } catch (error) {
      // JSON.stringify on an Error object yields "{}", so send the message instead.
      connection.send(
        JSON.stringify({
          type: "error",
          error: error instanceof Error ? error.message : String(error),
        }),
      );
    }
  }
}
```



src/index.ts

```ts
import {
  Agent,
  type Connection,
  type ConnectionContext,
  type WSMessage,
} from "agents";
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onConnect(connection: Connection, ctx: ConnectionContext) {
    // Connections are accepted by default; add auth or setup logic here.
  }

  async onMessage(connection: Connection, message: WSMessage) {
    // WSMessage may be binary; this example assumes JSON text frames.
    const msg = JSON.parse(message as string);
    await this.queryReasoningModel(connection, msg.prompt);
  }

  async queryReasoningModel(connection: Connection, userPrompt: string) {
    try {
      const workersai = createWorkersAI({ binding: this.env.AI });
      const result = streamText({
        model: workersai("@cf/zai-org/glm-4.7-flash"),
        prompt: userPrompt,
      });

      for await (const chunk of result.textStream) {
        if (chunk) {
          connection.send(JSON.stringify({ type: "chunk", content: chunk }));
        }
      }

      connection.send(JSON.stringify({ type: "done" }));
    } catch (error) {
      // JSON.stringify on an Error object yields "{}", so send the message instead.
      connection.send(
        JSON.stringify({
          type: "error",
          error: error instanceof Error ? error.message : String(error),
        }),
      );
    }
  }
}
```



You can also persist AI model responses back into the Agent's state using this.setState. If a user disconnects, read the message history and send it to them when they reconnect.
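A minimal sketch of that pattern, assuming a state shape we define ourselves (the `ChatState` type and `appendChunk` helper are hypothetical, not part of the Agents API):

```ts
// Hypothetical state shape: a flat message history persisted in Agent state.
type ChatMessage = { role: "user" | "assistant"; content: string };
type ChatState = { messages: ChatMessage[] };

// Pure helper: append a streamed chunk to the last assistant message,
// starting a new one if the last message is not from the assistant.
// Being pure, it is testable without the Agents SDK.
function appendChunk(state: ChatState, chunk: string): ChatState {
  const messages = [...state.messages];
  const last = messages[messages.length - 1];
  if (last && last.role === "assistant") {
    messages[messages.length - 1] = { ...last, content: last.content + chunk };
  } else {
    messages.push({ role: "assistant", content: chunk });
  }
  return { messages };
}

// Inside the Agent, each streamed chunk would be persisted roughly as:
//   this.setState(appendChunk(this.state, chunk));
// and on reconnect the history replayed with something like:
//   connection.send(JSON.stringify({ type: "history", messages: this.state.messages }));
```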

Workers AI

You can use any model available in Workers AI within your Agent by configuring a binding. No API key is needed.

Workers AI supports streaming responses by setting stream: true. Use streaming to avoid buffering and delaying responses, especially for larger models and reasoning models.

src/index.js

```js
import { Agent } from "agents";

export class MyAgent extends Agent {
  async onRequest(request) {
    const stream = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
        stream: true,
      },
    );

    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  }
}
```



src/index.ts

```ts
import { Agent } from "agents";

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const stream = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
        stream: true,
      },
    );

    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  }
}
```



Your Wrangler configuration needs the ai binding:

JSONC

```jsonc
{
  "ai": {
    "binding": "AI",
  },
}
```

TOML

```toml
[ai]
binding = "AI"
```


Model routing

You can use AI Gateway directly from an Agent by specifying a gateway configuration when calling the AI binding. Model routing lets you route requests across providers based on availability, rate limits, or cost budgets.

src/index.js

```js
import { Agent } from "agents";

export class MyAgent extends Agent {
  async onRequest(request) {
    const response = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
      },
      {
        gateway: {
          id: "{gateway_id}",
          skipCache: false,
          cacheTtl: 3360,
        },
      },
    );

    return Response.json(response);
  }
}
```



src/index.ts

```ts
import { Agent } from "agents";

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const response = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
      },
      {
        gateway: {
          id: "{gateway_id}",
          skipCache: false,
          cacheTtl: 3360,
        },
      },
    );

    return Response.json(response);
  }
}
```



The ai binding in your Wrangler configuration is shared between Workers AI and AI Gateway.

JSONC

```jsonc
{
  "ai": {
    "binding": "AI",
  },
}
```

TOML

```toml
[ai]
binding = "AI"
```


Visit the AI Gateway documentation to learn how to configure a gateway and retrieve a gateway ID.

AI SDK

The AI SDK ↗ provides a unified API for text generation, tool calling, structured responses, and more. It works with any provider that has an AI SDK adapter, including Workers AI via workers-ai-provider ↗.

```sh
npm i ai workers-ai-provider
# or with another package manager:
yarn add ai workers-ai-provider
pnpm add ai workers-ai-provider
bun add ai workers-ai-provider
```

src/index.js

```js
import { Agent } from "agents";
import { generateText } from "ai";
import { createWorkersAI } from "workers-ai-provider";

export class MyAgent extends Agent {
  async onRequest(request) {
    const workersai = createWorkersAI({ binding: this.env.AI });
    const { text } = await generateText({
      model: workersai("@cf/zai-org/glm-4.7-flash"),
      prompt: "Build me an AI agent on Cloudflare Workers",
    });

    return Response.json({ modelResponse: text });
  }
}
```



src/index.ts

```ts
import { Agent } from "agents";
import { generateText } from "ai";
import { createWorkersAI } from "workers-ai-provider";

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request): Promise<Response> {
    const workersai = createWorkersAI({ binding: this.env.AI });
    const { text } = await generateText({
      model: workersai("@cf/zai-org/glm-4.7-flash"),
      prompt: "Build me an AI agent on Cloudflare Workers",
    });

    return Response.json({ modelResponse: text });
  }
}
```



You can swap providers to use OpenAI, Anthropic, or any other AI SDK-compatible adapter:

```sh
npm i ai @ai-sdk/openai
# or with another package manager:
yarn add ai @ai-sdk/openai
pnpm add ai @ai-sdk/openai
bun add ai @ai-sdk/openai
```

src/index.js

```js
import { Agent } from "agents";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

export class MyAgent extends Agent {
  async onRequest(request) {
    const { text } = await generateText({
      model: openai("gpt-4o"),
      prompt: "Build me an AI agent on Cloudflare Workers",
    });

    return Response.json({ modelResponse: text });
  }
}
```



src/index.ts

```ts
import { Agent } from "agents";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

export class MyAgent extends Agent {
  async onRequest(request: Request): Promise<Response> {
    const { text } = await generateText({
      model: openai("gpt-4o"),
      prompt: "Build me an AI agent on Cloudflare Workers",
    });

    return Response.json({ modelResponse: text });
  }
}
```



OpenAI-compatible endpoints

Agents can call models on any service that supports the OpenAI API. For example, you can use the OpenAI SDK to call Google's Gemini models ↗ directly from your Agent.

Agents can stream responses back over HTTP using Server-Sent Events (SSE) from within an onRequest handler, or stream responses back to clients using the native WebSocket API.
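For reference, an SSE response frames each chunk as one or more `data:` lines followed by a blank line. A minimal encoder sketch (the `sseFrame` helper is illustrative, not part of any SDK):

```ts
// Encode one chunk of model output as a Server-Sent Events frame.
// Clients using EventSource (or parsing text/event-stream by hand)
// receive each frame as a discrete message.
function sseFrame(data: string): Uint8Array {
  // Multi-line payloads need one "data:" prefix per line.
  const body = data
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return new TextEncoder().encode(body + "\n\n");
}
```

In a handler that serves the response with `content-type: text/event-stream`, each chunk written to the stream could be passed through a helper like this instead of being written as raw text.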

src/index.js

```js
import { Agent } from "agents";
import { OpenAI } from "openai";

export class MyAgent extends Agent {
  async onRequest(request) {
    const client = new OpenAI({
      apiKey: this.env.GEMINI_API_KEY,
      baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
    });

    let { readable, writable } = new TransformStream();
    let writer = writable.getWriter();
    const textEncoder = new TextEncoder();

    this.ctx.waitUntil(
      (async () => {
        const stream = await client.chat.completions.create({
          model: "gemini-2.0-flash",
          messages: [
            { role: "user", content: "Write me a Cloudflare Worker." },
          ],
          stream: true,
        });

        for await (const part of stream) {
          writer.write(
            textEncoder.encode(part.choices[0]?.delta?.content || ""),
          );
        }
        writer.close();
      })(),
    );

    return new Response(readable);
  }
}
```



src/index.ts

```ts
import { Agent } from "agents";
import { OpenAI } from "openai";

interface Env {
  GEMINI_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request): Promise<Response> {
    const client = new OpenAI({
      apiKey: this.env.GEMINI_API_KEY,
      baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
    });

    let { readable, writable } = new TransformStream();
    let writer = writable.getWriter();
    const textEncoder = new TextEncoder();

    this.ctx.waitUntil(
      (async () => {
        const stream = await client.chat.completions.create({
          model: "gemini-2.0-flash",
          messages: [
            { role: "user", content: "Write me a Cloudflare Worker." },
          ],
          stream: true,
        });

        for await (const part of stream) {
          writer.write(
            textEncoder.encode(part.choices[0]?.delta?.content || ""),
          );
        }
        writer.close();
      })(),
    );

    return new Response(readable);
  }
}
```

