Voice agents (Beta)
Build real-time voice agents with speech-to-text (STT), text-to-speech (TTS), and conversation persistence. Audio streams over WebSockets; no SFU or meeting infrastructure required.
Overview
@cloudflare/voice provides two server-side mixins and matching client libraries:
| Export | Import path | Purpose |
|---|---|---|
| withVoice | @cloudflare/voice | Full voice agent: STT, LLM, TTS, persistence |
| withVoiceInput | @cloudflare/voice | STT only: transcribe without responding |
| useVoiceAgent | @cloudflare/voice/react | React hook for withVoice agents |
| useVoiceInput | @cloudflare/voice/react | React hook for withVoiceInput agents |
| VoiceClient | @cloudflare/voice/client | Framework-agnostic client |
Built on Cloudflare Durable Objects, you get:
- Real-time audio: microphone audio streams in as binary WebSocket frames, and TTS audio streams back
- Automatic conversation persistence: messages are stored in SQLite and survive restarts
- Streaming TTS: LLM tokens are chunked by sentence and synthesized concurrently
- Interrupt handling: if the user starts speaking during playback, the current response is cancelled
- Continuous STT: a dedicated transcriber session per call, with the model detecting speaking turns
- Pipeline hooks: intercept and transform text at every stage
Quick start
Install
Terminal window
npm install @cloudflare/voice agents
Server
JavaScript
import { Agent } from "agents";
import { withVoice, WorkersAIFluxSTT, WorkersAITTS } from "@cloudflare/voice";
const VoiceAgent = withVoice(Agent);
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
async onTurn(transcript, context) {
return "Hello! I heard you say: " + transcript;
}
}
TypeScript
import { Agent } from "agents";
import {
withVoice,
WorkersAIFluxSTT,
WorkersAITTS,
type VoiceTurnContext,
} from "@cloudflare/voice";
const VoiceAgent = withVoice(Agent);
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
async onTurn(transcript: string, context: VoiceTurnContext) {
return "Hello! I heard you say: " + transcript;
}
}
Client (React)
import { useVoiceAgent } from "@cloudflare/voice/react";
function VoiceUI() {
const {
status,
transcript,
interimTranscript,
audioLevel,
isMuted,
startCall,
endCall,
toggleMute,
} = useVoiceAgent({ agent: "MyAgent" });
return (
<div>
<p>Status: {status}</p>
<button onClick={status === "idle" ? startCall : endCall}>
{status === "idle" ? "Start Call" : "End Call"}
</button>
<button onClick={toggleMute}>{isMuted ? "Unmute" : "Mute"}</button>
{interimTranscript && (
<p>
<em>{interimTranscript}</em>
</p>
)}
{transcript.map((msg, i) => (
<p key={i}>
<strong>{msg.role}:</strong> {msg.text}
</p>
))}
</div>
);
}
Wrangler configuration
JSONC
{
"ai": {
"binding": "AI"
},
"durable_objects": {
"bindings": [
{
"name": "MyAgent",
"class_name": "MyAgent"
}
]
},
"migrations": [
{
"tag": "v1",
"new_sqlite_classes": ["MyAgent"]
}
]
}
TOML
[ai]
binding = "AI"
[[durable_objects.bindings]]
name = "MyAgent"
class_name = "MyAgent"
[[migrations]]
tag = "v1"
new_sqlite_classes = [ "MyAgent" ]
How it works
Browser Durable Object (withVoice)
┌──────────┐ ┌──────────────────────────┐
│ Mic │ binary PCM (16kHz) │ Transcriber session │
│ │ ──────────────────────► │ (per-call, continuous) │
│ │ │ ↓ model detects turn │
│ │ JSON: transcript │ onTurn() → your LLM code │
│ │ ◄────────────────────── │ ↓ (sentence chunking) │
│ │ binary: audio │ TTS │
│ Speaker │ ◄────────────────────── │ │
└──────────┘ └──────────────────────────┘
- The client captures microphone audio and sends it as binary WebSocket frames (16 kHz mono 16-bit PCM).
- Audio streams continuously into a transcriber session (created at start_call and kept for the whole call).
- The STT model detects when the user has finished an utterance and fires onUtterance. All providers use model-driven turn detection; the client never sends an end-of-speech signal for STT.
- Your onTurn() method runs, typically a single LLM call.
- The response is chunked by sentence and synthesized through TTS.
- The audio streams back to the client for playback.
While the user is speaking, the client receives transcript_interim messages with partial results, so you can show live feedback in the UI.
Server API: withVoice
withVoice(Agent) adds the full voice pipeline to an Agent class.
Providers
Set providers as class properties. Class field initializers run after super(), so this.env is available.
| Property | Type | Required | Description |
|---|---|---|---|
| transcriber | Transcriber | Yes | Continuous per-call STT provider |
| tts | TTSProvider | Yes | Text-to-speech |
JavaScript
import { withVoice, WorkersAIFluxSTT, WorkersAITTS } from "@cloudflare/voice";
const VoiceAgent = withVoice(Agent);
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
}
TypeScript
import { withVoice, WorkersAIFluxSTT, WorkersAITTS } from "@cloudflare/voice";
const VoiceAgent = withVoice(Agent);
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
}
If you need to switch models at runtime (for example, a dropdown toggling between Flux and Nova 3), override createTranscriber:
JavaScript
export class MyAgent extends VoiceAgent {
tts = new WorkersAITTS(this.env.AI);
createTranscriber(connection) {
return new WorkersAIFluxSTT(this.env.AI);
}
}
TypeScript
export class MyAgent extends VoiceAgent<Env> {
tts = new WorkersAITTS(this.env.AI);
createTranscriber(connection: Connection): Transcriber {
return new WorkersAIFluxSTT(this.env.AI);
}
}
onTurn(transcript, context)
Required. Called when the user finishes an utterance and its transcription is complete.
Return a string, an AsyncIterable<string>, or a ReadableStream to stream the response.
Simple response:
JavaScript
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
async onTurn(transcript, context) {
return "You said: " + transcript;
}
}
TypeScript
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
async onTurn(transcript: string, context: VoiceTurnContext) {
return "You said: " + transcript;
}
}
Streaming response (recommended for LLMs):
JavaScript
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
async onTurn(transcript, context) {
const workersai = createWorkersAI({ binding: this.env.AI });
const result = streamText({
model: workersai("@cf/moonshotai/kimi-k2.5"),
system: "You are a helpful voice assistant. Keep responses concise.",
messages: [
...context.messages.map((m) => ({
role: m.role,
content: m.content,
})),
{ role: "user", content: transcript },
],
abortSignal: context.signal,
});
return result.textStream;
}
}
TypeScript
import { streamText } from "ai";
import { createWorkersAI } from "workers-ai-provider";
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
async onTurn(transcript: string, context: VoiceTurnContext) {
const workersai = createWorkersAI({ binding: this.env.AI });
const result = streamText({
model: workersai("@cf/moonshotai/kimi-k2.5"),
system: "You are a helpful voice assistant. Keep responses concise.",
messages: [
...context.messages.map(m => ({
role: m.role as "user" | "assistant",
content: m.content,
})),
{ role: "user", content: transcript },
],
abortSignal: context.signal,
});
return result.textStream;
}
}
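Any AsyncIterable<string> also works, so an async generator can drive the same pipeline; a minimal sketch:
TypeScript
export class MyAgent extends VoiceAgent<Env> {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);
  // An async generator is an AsyncIterable<string>, so it can be returned directly.
  async *respond(transcript: string) {
    yield "Let me think. ";
    yield "You said: " + transcript;
  }
  async onTurn(transcript: string, context: VoiceTurnContext) {
    return this.respond(transcript);
  }
}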
The context object provides:
| Field | Type | Description |
|---|---|---|
| connection | Connection | The WebSocket connection |
| messages | Array<{ role: string; content: string }> | Conversation history from SQLite |
| signal | AbortSignal | Aborted on interrupt or disconnect |
Lifecycle hooks
| Method | Description |
|---|---|
| beforeCallStart(connection) | Return false to reject the call |
| onCallStart(connection) | Called after a call is accepted |
| onCallEnd(connection) | Called when a call ends |
| onInterrupt(connection) | Called when the user interrupts during playback |
Pipeline hooks
Intercept and transform data at each pipeline stage. Return null to skip the current utterance.
| Method | Receives | Skippable |
|---|---|---|
| afterTranscribe(transcript, connection) | STT text | Yes |
| beforeSynthesize(text, connection) | Text before TTS | Yes |
| afterSynthesize(audio, text, connection) | Audio after TTS | Yes |
JavaScript
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
afterTranscribe(transcript, connection) {
if (transcript.length < 3) return null;
return transcript;
}
beforeSynthesize(text, connection) {
return text.replace(/\bAI\b/g, "A.I.");
}
async onTurn(transcript, context) {
return transcript;
}
}
TypeScript
import { type Connection } from "agents";
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
afterTranscribe(transcript: string, connection: Connection) {
if (transcript.length < 3) return null;
return transcript;
}
beforeSynthesize(text: string, connection: Connection) {
return text.replace(/\bAI\b/g, "A.I.");
}
async onTurn(transcript: string, context: VoiceTurnContext) {
return transcript;
}
}
Convenience methods
| Method | Description |
|---|---|
| speak(connection, text) | Synthesize audio and send it to one connection |
| speakAll(text) | Synthesize audio and send it to all connections |
| forceEndCall(connection) | End a call programmatically |
| saveMessage(role, text) | Persist a message to the conversation history |
| getConversationHistory() | Read the conversation history from SQLite |
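For example, a minimal sketch that greets the caller when a call is accepted, combining the onCallStart lifecycle hook with speak and saveMessage (the GreetingAgent name is ours, as is the assumption that speak can be awaited):
TypeScript
export class GreetingAgent extends VoiceAgent<Env> {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);
  async onCallStart(connection: Connection) {
    const greeting = "Hi! How can I help you today?";
    // Synthesize and play the greeting on this connection...
    await this.speak(connection, greeting);
    // ...and persist it so it appears in context.messages on later turns.
    this.saveMessage("assistant", greeting);
  }
  async onTurn(transcript: string, context: VoiceTurnContext) {
    return "You said: " + transcript;
  }
}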
Configuration options
Pass options as the second argument to withVoice():
JavaScript
const VoiceAgent = withVoice(Agent, {
historyLimit: 20,
audioFormat: "mp3",
maxMessageCount: 1000,
});
TypeScript
const VoiceAgent = withVoice(Agent, {
historyLimit: 20,
audioFormat: "mp3",
maxMessageCount: 1000,
});
| Option | Type | Default | Description |
|---|---|---|---|
| historyLimit | number | 20 | Maximum messages loaded into context |
| audioFormat | string | "mp3" | Audio format sent to clients |
| maxMessageCount | number | 1000 | Maximum messages stored in SQLite |
Server API: withVoiceInput
withVoiceInput(Agent) adds STT-only voice input: no TTS, no LLM, no generated responses. It suits dictation, voice search, or any UI that needs speech-to-text without a conversational agent.
JavaScript
import { Agent } from "agents";
import { withVoiceInput, WorkersAINova3STT } from "@cloudflare/voice";
const InputAgent = withVoiceInput(Agent);
export class DictationAgent extends InputAgent {
transcriber = new WorkersAINova3STT(this.env.AI);
onTranscript(text, connection) {
console.log("User said:", text);
}
}
TypeScript
import { Agent } from "agents";
import { withVoiceInput, WorkersAINova3STT } from "@cloudflare/voice";
const InputAgent = withVoiceInput(Agent);
export class DictationAgent extends InputAgent<Env> {
transcriber = new WorkersAINova3STT(this.env.AI);
onTranscript(text: string, connection: Connection) {
console.log("User said:", text);
}
}
onTranscript(text, connection)
Called after each utterance is transcribed. Override it to handle the transcribed text.
Hooks
withVoiceInput supports the same lifecycle hooks as withVoice:
- beforeCallStart(connection): return false to reject the call
- onCallStart(connection), onCallEnd(connection), onInterrupt(connection)
- createTranscriber(connection): override for runtime model switching
- afterTranscribe(transcript, connection): filter or transform the transcript
It has no TTS hooks (beforeSynthesize, afterSynthesize) and no onTurn.
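For instance, a dictation agent that drops fragments under three characters before they reach onTranscript; a sketch built only from the hooks above:
TypeScript
import { Agent, type Connection } from "agents";
import { withVoiceInput, WorkersAINova3STT } from "@cloudflare/voice";
const InputAgent = withVoiceInput(Agent);
export class DictationAgent extends InputAgent<Env> {
  transcriber = new WorkersAINova3STT(this.env.AI);
  afterTranscribe(transcript: string, connection: Connection) {
    // Returning null skips the utterance entirely.
    return transcript.trim().length < 3 ? null : transcript;
  }
  onTranscript(text: string, connection: Connection) {
    console.log("User said:", text);
  }
}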
Client API: React hooks
useVoiceAgent
Wraps VoiceClient for withVoice agents. It manages the connection, microphone capture, playback, silence detection, and interrupt detection.
import { useVoiceAgent } from "@cloudflare/voice/react";
const {
status, // "idle" | "listening" | "thinking" | "speaking"
transcript, // TranscriptMessage[] — conversation history
interimTranscript, // string | null — real-time partial transcript
metrics, // VoicePipelineMetrics | null
audioLevel, // number (0–1) — current mic RMS level
isMuted, // boolean
connected, // boolean — WebSocket connected
error, // string | null
startCall, // () => Promise<void>
endCall, // () => void
toggleMute, // () => void
sendText, // (text: string) => void — bypass STT
sendJSON, // (data: Record<string, unknown>) => void
lastCustomMessage, // unknown — last non-voice message from server
} = useVoiceAgent({
agent: "MyAgent",
name: "default",
host: window.location.host,
});
Tuning options
| Option | Type | Default | Description |
|---|---|---|---|
| silenceThreshold | number | 0.04 | RMS level below which audio counts as silence |
| silenceDurationMs | number | 500 | Milliseconds of silence before end_of_speech fires |
| interruptThreshold | number | 0.05 | RMS threshold for detecting speech during playback |
| interruptChunks | number | 2 | Consecutive high-RMS chunks required to trigger an interrupt |
Changing tuning options reconnects the client (the connection key includes these parameters).
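A sketch of overriding the defaults, assuming the tuning options are passed in the same options object as agent:
const { status, startCall } = useVoiceAgent({
  agent: "MyAgent",
  // Tuning options (assumed to sit alongside the connection options):
  silenceThreshold: 0.05, // tolerate a noisier room
  silenceDurationMs: 700, // wait longer before ending a turn
  interruptThreshold: 0.06,
  interruptChunks: 3,
});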
useVoiceInput
A lightweight hook for dictation and speech-to-text. It accumulates all utterances into a single string.
import { useVoiceInput } from "@cloudflare/voice/react";
function Dictation() {
const {
transcript, // string — accumulated text from all utterances
interimTranscript, // string | null — current partial transcript
isListening, // boolean
audioLevel, // number (0–1)
isMuted, // boolean
error, // string | null
start, // () => Promise<void>
stop, // () => void
toggleMute, // () => void
clear, // () => void — clear accumulated transcript
} = useVoiceInput({ agent: "DictationAgent" });
return (
<div>
<textarea
value={
transcript + (interimTranscript ? " " + interimTranscript : "")
}
readOnly
/>
<button onClick={isListening ? stop : start}>
{isListening ? "Stop" : "Dictate"}
</button>
</div>
);
}
Client API: VoiceClient
A framework-agnostic client for non-React environments.
JavaScript
import { VoiceClient } from "@cloudflare/voice/client";
const client = new VoiceClient({ agent: "MyAgent" });
client.addEventListener("statuschange", (status) => {
console.log("Status:", status);
});
client.addEventListener("transcriptchange", (messages) => {
console.log("Transcript:", messages);
});
client.addEventListener("error", (err) => {
console.error("Error:", err);
});
client.connect();
await client.startCall();
// Later:
client.endCall();
client.disconnect();
TypeScript
import { VoiceClient } from "@cloudflare/voice/client";
const client = new VoiceClient({ agent: "MyAgent" });
client.addEventListener("statuschange", (status) => {
console.log("Status:", status);
});
client.addEventListener("transcriptchange", (messages) => {
console.log("Transcript:", messages);
});
client.addEventListener("error", (err) => {
console.error("Error:", err);
});
client.connect();
await client.startCall();
// Later:
client.endCall();
client.disconnect();
Events
| Event | Data type | Description |
|---|---|---|
| statuschange | VoiceStatus | The pipeline status changed |
| transcriptchange | TranscriptMessage[] | The transcript was updated |
| interimtranscript | string or null | Interim transcript from streaming STT |
| metricschange | VoicePipelineMetrics | Pipeline timing metrics |
| audiolevelchange | number | Microphone audio level (0–1) |
| connectionchange | boolean | WebSocket connected or disconnected |
| mutechange | boolean | Mute state changed |
| error | string or null | An error occurred |
| custommessage | unknown | Non-voice message from the server |
Advanced options
| Option | Type | Description |
|---|---|---|
| transport | VoiceTransport | Custom transport (defaults to WebSocket via PartySocket) |
| audioInput | VoiceAudioInput | Custom microphone capture (defaults to the built-in AudioWorklet) |
| preferredFormat | VoiceAudioFormat | Hint for the server audio format (advisory only) |
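As an illustration, a hedged sketch of passing preferredFormat; the accepted VoiceAudioFormat values are an assumption, with "mp3" chosen to match the server-side default:
import { VoiceClient } from "@cloudflare/voice/client";
const client = new VoiceClient({
  agent: "MyAgent",
  // Advisory only: the server may ignore this hint.
  preferredFormat: "mp3",
});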
Providers
Built-in (Workers AI)
No API keys needed; these use your Workers AI binding:
| Class | Type | Default model | Recommended for |
|---|---|---|---|
| WorkersAIFluxSTT | Continuous STT | @cf/deepgram/flux | withVoice |
| WorkersAINova3STT | Continuous STT | @cf/deepgram/nova-3 | withVoiceInput |
| WorkersAITTS | TTS | @cf/deepgram/aura-1 | Either |
JavaScript
import { Agent } from "agents";
import {
withVoice,
WorkersAIFluxSTT,
WorkersAINova3STT,
WorkersAITTS,
} from "@cloudflare/voice";
const VoiceAgent = withVoice(Agent);
// Default usage
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
}
// Custom options
export class CustomAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI, {
eotThreshold: 0.8,
keyterms: ["Cloudflare", "Workers"],
});
tts = new WorkersAITTS(this.env.AI, {
model: "@cf/deepgram/aura-1",
speaker: "asteria",
});
}
TypeScript
import { Agent } from "agents";
import {
withVoice,
WorkersAIFluxSTT,
WorkersAINova3STT,
WorkersAITTS,
} from "@cloudflare/voice";
const VoiceAgent = withVoice(Agent);
// Default usage
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new WorkersAITTS(this.env.AI);
}
// Custom options
export class CustomAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI, {
eotThreshold: 0.8,
keyterms: ["Cloudflare", "Workers"],
});
tts = new WorkersAITTS(this.env.AI, {
model: "@cf/deepgram/aura-1",
speaker: "asteria",
});
}
Third-party providers
| Package | Class | Description |
|---|---|---|
| @cloudflare/voice-deepgram | DeepgramSTT | Continuous STT |
| @cloudflare/voice-elevenlabs | ElevenLabsTTS | High-quality TTS |
| @cloudflare/voice-twilio | TwilioAdapter | Telephony (phone calls) |
ElevenLabs TTS:
JavaScript
import { ElevenLabsTTS } from "@cloudflare/voice-elevenlabs";
export class MyAgent extends VoiceAgent {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new ElevenLabsTTS({
apiKey: this.env.ELEVENLABS_API_KEY,
voiceId: "21m00Tcm4TlvDq8ikWAM",
});
}
TypeScript
import { ElevenLabsTTS } from "@cloudflare/voice-elevenlabs";
export class MyAgent extends VoiceAgent<Env> {
transcriber = new WorkersAIFluxSTT(this.env.AI);
tts = new ElevenLabsTTS({
apiKey: this.env.ELEVENLABS_API_KEY,
voiceId: "21m00Tcm4TlvDq8ikWAM",
});
}
Deepgram STT:
JavaScript
import { DeepgramSTT } from "@cloudflare/voice-deepgram";
export class MyAgent extends VoiceAgent {
transcriber = new DeepgramSTT({
apiKey: this.env.DEEPGRAM_API_KEY,
});
tts = new WorkersAITTS(this.env.AI);
}
TypeScript
import { DeepgramSTT } from "@cloudflare/voice-deepgram";
export class MyAgent extends VoiceAgent<Env> {
transcriber = new DeepgramSTT({
apiKey: this.env.DEEPGRAM_API_KEY,
});
tts = new WorkersAITTS(this.env.AI);
}
Telephony (Twilio)
Connect phone calls to your voice agent through the Twilio adapter:
Terminal window
npm install @cloudflare/voice-twilio
The adapter bridges Twilio Media Streams to your VoiceAgent:
Phone → Twilio → WebSocket → TwilioAdapter → WebSocket → VoiceAgent
WorkersAITTS returns MP3, which cannot be decoded to PCM in the Workers runtime. When using the Twilio adapter, pick a TTS provider that outputs raw PCM (for example, ElevenLabs with outputFormat: "pcm_16000").
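Putting that together, a sketch of a phone-facing agent pairing Workers AI STT with ElevenLabs PCM output; the outputFormat value comes from the note above, and the rest mirrors the earlier ElevenLabs example:
TypeScript
import { ElevenLabsTTS } from "@cloudflare/voice-elevenlabs";
export class PhoneAgent extends VoiceAgent<Env> {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  // Raw PCM instead of MP3, so the Twilio adapter can forward audio to the phone leg.
  tts = new ElevenLabsTTS({
    apiKey: this.env.ELEVENLABS_API_KEY,
    voiceId: "21m00Tcm4TlvDq8ikWAM",
    outputFormat: "pcm_16000",
  });
}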
Text messages
withVoice agents can also receive text messages that bypass STT entirely. This is useful for offering chat-style input alongside voice.
const { sendText } = useVoiceAgent({ agent: "MyAgent" });
// Send text — goes straight to onTurn() without STT
sendText("What is the weather like today?");
Text messages work both during and outside a call. During a call, the response is spoken via TTS; outside a call, it is sent as a plain text transcript message.
Custom messages
Send and receive application-level JSON messages alongside the voice protocol. Non-voice messages are routed to the server's onMessage handler and surface on the client as the custommessage event.
Server:
JavaScript
export class MyAgent extends VoiceAgent {
onMessage(connection, message) {
const data = JSON.parse(message);
if (data.type === "kick_speaker") {
this.forceEndCall(connection);
}
}
}
TypeScript
export class MyAgent extends VoiceAgent<Env> {
onMessage(connection: Connection, message: WSMessage) {
const data = JSON.parse(message as string);
if (data.type === "kick_speaker") {
this.forceEndCall(connection);
}
}
}
Client:
const { sendJSON, lastCustomMessage } = useVoiceAgent({ agent: "MyAgent" });
sendJSON({ type: "kick_speaker" });
useEffect(() => {
if (lastCustomMessage) {
console.log("Custom message:", lastCustomMessage);
}
}, [lastCustomMessage]);
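The server can also push custom messages to clients; a minimal sketch, assuming the agents SDK connection.send method is available here:
TypeScript
export class MyAgent extends VoiceAgent<Env> {
  onCallEnd(connection: Connection) {
    // Any non-voice JSON like this surfaces on the client as `custommessage`.
    connection.send(JSON.stringify({ type: "call_ended", at: Date.now() }));
  }
}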
Single-speaker enforcement
Use beforeCallStart to restrict who can start a call. This example enforces a single speaker, allowing only one active speaker connection at a time:
JavaScript
export class MyAgent extends VoiceAgent {
#speakerId = null;
beforeCallStart(connection) {
if (this.#speakerId !== null) {
return false;
}
this.#speakerId = connection.id;
return true;
}
onCallEnd(connection) {
if (this.#speakerId === connection.id) {
this.#speakerId = null;
}
}
}
TypeScript
import { type Connection } from "agents";
export class MyAgent extends VoiceAgent<Env> {
#speakerId: string | null = null;
beforeCallStart(connection: Connection) {
if (this.#speakerId !== null) {
return false;
}
this.#speakerId = connection.id;
return true;
}
onCallEnd(connection: Connection) {
if (this.#speakerId === connection.id) {
this.#speakerId = null;
}
}
}
Pipeline metrics
withVoice agents emit timing metrics after each turn:
const { metrics } = useVoiceAgent({ agent: "MyAgent" });
// metrics: {
// llm_ms: 850,
// tts_ms: 200,
// first_audio_ms: 950,
// total_ms: 1200,
// }
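For example, a small status line built from the fields above, reading first_audio_ms as the delay until the first audio chunk (an interpretation, not a documented guarantee):
function LatencyBadge() {
  const { metrics } = useVoiceAgent({ agent: "MyAgent" });
  if (!metrics) return null;
  return (
    <p>
      First audio in {metrics.first_audio_ms} ms (LLM {metrics.llm_ms} ms, TTS{" "}
      {metrics.tts_ms} ms)
    </p>
  );
}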
Conversation history
withVoice automatically persists conversation messages to SQLite. Access the history via context.messages in onTurn, or call directly:
JavaScript
const history = this.getConversationHistory(20);
this.saveMessage("assistant", "Welcome! How can I help?");
TypeScript
const history = this.getConversationHistory(20);
this.saveMessage("assistant", "Welcome! How can I help?");
History survives Durable Object restarts and client reconnects. During a call, voice agents use keepAlive to prevent eviction.