Using AI models
Agents can call AI models from any provider. Workers AI is built in and requires no API key. You can also use OpenAI ↗, Anthropic ↗, Google Gemini ↗, or any service that exposes an OpenAI-compatible API.
The AI SDK ↗ provides a unified interface across all of these providers, and is what AIChatAgent and the starter template use under the hood. You can also use the model routing features in AI Gateway to route across providers, evaluate responses, and manage rate limits.
Calling AI models
You can call models from any method within an Agent: from an HTTP request via the onRequest handler, when a scheduled task runs, while handling a WebSocket message in the onMessage handler, or from any of your own methods.
Agents can call AI models autonomously, and can handle long-running responses that take minutes (or longer) to complete. If a client disconnects mid-stream, the Agent keeps running and can catch the client up when it reconnects.
Streaming over WebSockets
Modern reasoning models can take some time both to generate a response _and_ to stream it back to the client. Instead of buffering the entire response, you can stream it back over WebSockets.
src/index.js
import { Agent } from "agents";
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";
export class MyAgent extends Agent {
async onConnect(connection, ctx) {
//
}
async onMessage(connection, message) {
let msg = JSON.parse(message);
await this.queryReasoningModel(connection, msg.prompt);
}
async queryReasoningModel(connection, userPrompt) {
try {
const workersai = createWorkersAI({ binding: this.env.AI });
const result = streamText({
model: workersai("@cf/zai-org/glm-4.7-flash"),
prompt: userPrompt,
});
for await (const chunk of result.textStream) {
if (chunk) {
connection.send(JSON.stringify({ type: "chunk", content: chunk }));
}
}
connection.send(JSON.stringify({ type: "done" }));
} catch (error) {
connection.send(JSON.stringify({ type: "error", error: error instanceof Error ? error.message : String(error) }));
}
}
}
src/index.ts
import { Agent, type Connection, type ConnectionContext, type WSMessage } from "agents";
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";
interface Env {
AI: Ai;
}
export class MyAgent extends Agent<Env> {
async onConnect(connection: Connection, ctx: ConnectionContext) {
//
}
async onMessage(connection: Connection, message: WSMessage) {
let msg = JSON.parse(message as string);
await this.queryReasoningModel(connection, msg.prompt);
}
async queryReasoningModel(connection: Connection, userPrompt: string) {
try {
const workersai = createWorkersAI({ binding: this.env.AI });
const result = streamText({
model: workersai("@cf/zai-org/glm-4.7-flash"),
prompt: userPrompt,
});
for await (const chunk of result.textStream) {
if (chunk) {
connection.send(JSON.stringify({ type: "chunk", content: chunk }));
}
}
connection.send(JSON.stringify({ type: "done" }));
} catch (error) {
connection.send(JSON.stringify({ type: "error", error: error instanceof Error ? error.message : String(error) }));
}
}
}
You can also use this.setState to persist AI model responses back into Agent state. If a user disconnects, read the message history and send it to them when they reconnect.
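A minimal sketch of that pattern (the `ChatState` shape and `missedSince` helper are hypothetical illustrations, not part of the Agents SDK): persist each streamed chunk into state, have the client report the last index it received, and replay the difference on reconnect.

```typescript
// Hypothetical state shape: the full list of chunks streamed so far.
type ChatState = { history: string[] };

// Pure helper: given persisted history and the index a client last saw,
// return the chunks it missed while disconnected.
function missedSince(state: ChatState, lastSeen: number): string[] {
  return state.history.slice(lastSeen);
}

// Inside an Agent, you might persist each chunk as it streams:
//   this.setState({ history: [...this.state.history, chunk] });
// and replay missed chunks in onConnect or onMessage when the client
// reports the last index it received:
//   for (const chunk of missedSince(this.state, msg.lastSeen)) {
//     connection.send(JSON.stringify({ type: "chunk", content: chunk }));
//   }
```

Keeping the helper pure makes the catch-up logic easy to test independently of the WebSocket plumbing.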
Workers AI
You can use any model available in Workers AI within your Agent by configuring a binding. No API key needed.
Workers AI supports streaming responses when you set stream: true. Use streaming to avoid buffering and delaying responses, especially for larger models or reasoning models.
src/index.js
import { Agent } from "agents";
export class MyAgent extends Agent {
async onRequest(request) {
const stream = await this.env.AI.run(
"@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
{
prompt: "Build me a Cloudflare Worker that returns JSON.",
stream: true,
},
);
return new Response(stream, {
headers: { "content-type": "text/event-stream" },
});
}
}
src/index.ts
import { Agent } from "agents";
interface Env {
AI: Ai;
}
export class MyAgent extends Agent<Env> {
async onRequest(request: Request) {
const stream = await this.env.AI.run(
"@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
{
prompt: "Build me a Cloudflare Worker that returns JSON.",
stream: true,
},
);
return new Response(stream, {
headers: { "content-type": "text/event-stream" },
});
}
}
Your Wrangler configuration needs an ai binding:
JSONC
{
"ai": {
"binding": "AI",
},
}
TOML
[ai]
binding = "AI"
Model routing
You can use AI Gateway directly from an Agent by specifying a gateway configuration when calling the AI binding. Model routing lets you route requests across providers based on availability, rate limits, or cost budgets.
src/index.js
import { Agent } from "agents";
export class MyAgent extends Agent {
async onRequest(request) {
const response = await this.env.AI.run(
"@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
{
prompt: "Build me a Cloudflare Worker that returns JSON.",
},
{
gateway: {
id: "{gateway_id}",
skipCache: false,
cacheTtl: 3360,
},
},
);
return Response.json(response);
}
}
src/index.ts
import { Agent } from "agents";
interface Env {
AI: Ai;
}
export class MyAgent extends Agent<Env> {
async onRequest(request: Request) {
const response = await this.env.AI.run(
"@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
{
prompt: "Build me a Cloudflare Worker that returns JSON.",
},
{
gateway: {
id: "{gateway_id}",
skipCache: false,
cacheTtl: 3360,
},
},
);
return Response.json(response);
}
}
The ai binding in your Wrangler configuration is shared between Workers AI and AI Gateway.
JSONC
{
"ai": {
"binding": "AI",
},
}
TOML
[ai]
binding = "AI"
Visit the AI Gateway documentation to learn how to configure a gateway and retrieve a gateway ID.
AI SDK
The AI SDK ↗ provides a unified API for text generation, tool calling, structured responses, and more. It works with any provider that has an AI SDK adapter, including Workers AI via workers-ai-provider ↗.
npm i ai workers-ai-provider
yarn add ai workers-ai-provider
pnpm add ai workers-ai-provider
bun add ai workers-ai-provider
src/index.js
import { Agent } from "agents";
import { generateText } from "ai";
import { createWorkersAI } from "workers-ai-provider";
export class MyAgent extends Agent {
async onRequest(request) {
const workersai = createWorkersAI({ binding: this.env.AI });
const { text } = await generateText({
model: workersai("@cf/zai-org/glm-4.7-flash"),
prompt: "Build me an AI agent on Cloudflare Workers",
});
return Response.json({ modelResponse: text });
}
}
src/index.ts
import { Agent } from "agents";
import { generateText } from "ai";
import { createWorkersAI } from "workers-ai-provider";
interface Env {
AI: Ai;
}
export class MyAgent extends Agent<Env> {
async onRequest(request: Request): Promise<Response> {
const workersai = createWorkersAI({ binding: this.env.AI });
const { text } = await generateText({
model: workersai("@cf/zai-org/glm-4.7-flash"),
prompt: "Build me an AI agent on Cloudflare Workers",
});
return Response.json({ modelResponse: text });
}
}
You can swap providers to use OpenAI, Anthropic, or any other AI SDK-compatible adapter:
npm i ai @ai-sdk/openai
yarn add ai @ai-sdk/openai
pnpm add ai @ai-sdk/openai
bun add ai @ai-sdk/openai
src/index.js
import { Agent } from "agents";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
export class MyAgent extends Agent {
async onRequest(request) {
const { text } = await generateText({
model: openai("gpt-4o"),
prompt: "Build me an AI agent on Cloudflare Workers",
});
return Response.json({ modelResponse: text });
}
}
src/index.ts
import { Agent } from "agents";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
export class MyAgent extends Agent {
async onRequest(request: Request): Promise<Response> {
const { text } = await generateText({
model: openai("gpt-4o"),
prompt: "Build me an AI agent on Cloudflare Workers",
});
return Response.json({ modelResponse: text });
}
}
OpenAI-compatible endpoints
Agents can call models on any service that supports the OpenAI API. For example, you can use the OpenAI SDK to call Google's Gemini models ↗ directly from your Agent.
Agents can stream responses back over HTTP using Server-Sent Events (SSE) from within an onRequest handler, or use the native WebSocket API to stream responses back to clients.
src/index.js
import { Agent } from "agents";
import { OpenAI } from "openai";
export class MyAgent extends Agent {
async onRequest(request) {
const client = new OpenAI({
apiKey: this.env.GEMINI_API_KEY,
baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});
let { readable, writable } = new TransformStream();
let writer = writable.getWriter();
const textEncoder = new TextEncoder();
this.ctx.waitUntil(
(async () => {
const stream = await client.chat.completions.create({
model: "gemini-2.0-flash",
messages: [
{ role: "user", content: "Write me a Cloudflare Worker." },
],
stream: true,
});
for await (const part of stream) {
writer.write(
textEncoder.encode(part.choices[0]?.delta?.content || ""),
);
}
writer.close();
})(),
);
return new Response(readable);
}
}
src/index.ts
import { Agent } from "agents";
import { OpenAI } from "openai";
interface Env {
GEMINI_API_KEY: string;
}
export class MyAgent extends Agent<Env> {
async onRequest(request: Request): Promise<Response> {
const client = new OpenAI({
apiKey: this.env.GEMINI_API_KEY,
baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});
let { readable, writable } = new TransformStream();
let writer = writable.getWriter();
const textEncoder = new TextEncoder();
this.ctx.waitUntil(
(async () => {
const stream = await client.chat.completions.create({
model: "gemini-2.0-flash",
messages: [
{ role: "user", content: "Write me a Cloudflare Worker." },
],
stream: true,
});
for await (const part of stream) {
writer.write(
textEncoder.encode(part.choices[0]?.delta?.content || ""),
);
}
writer.close();
})(),
);
return new Response(readable);
}
}