Voice Calls

How It Works

The MCP server is infrastructure only: it relays text between the caller and your AI client but never generates AI responses itself. Your client is the brain; the server is the telephone.

Live Voice Call Flow

Inbound Call
  -> Twilio webhook hits /webhooks/:agentId/voice
  -> Server returns ConversationRelay TwiML
  -> Twilio opens WebSocket to /webhooks/:agentId/voice-ws
  -> Human speaks -> Twilio STT -> Text
  -> Text sent to your AI client (via MCP sampling)
  -> Client responds with text
  -> Text sent to Twilio -> Twilio TTS -> Human hears

Twilio handles STT and TTS. The server only passes text back and forth.
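
The relay step in the flow above can be sketched as a single function that turns one inbound WebSocket message into at most one outbound message. This is a minimal sketch, not the server's actual code. The message shapes (`prompt` with `voicePrompt`, `text` with `token`) follow Twilio's ConversationRelay message format as commonly documented; `handleRelayMessage` and `askClient` are hypothetical names, and `askClient` is a synchronous stand-in for what would really be an asynchronous MCP sampling round trip.

```typescript
// Subset of ConversationRelay WebSocket messages used in this sketch.
type RelayInbound =
  | { type: "setup"; callSid: string }
  | { type: "prompt"; voicePrompt: string; last: boolean }
  | { type: "interrupt" };

type RelayOutbound = { type: "text"; token: string; last: boolean };

// Hypothetical stand-in for the MCP sampling request to the AI client.
// Synchronous here for brevity; a real round trip would be asynchronous.
type AskClient = (callerText: string) => string;

// Turn one inbound relay message into zero or one outbound text message.
function handleRelayMessage(raw: string, askClient: AskClient): RelayOutbound | null {
  const msg = JSON.parse(raw) as RelayInbound;
  if (msg.type !== "prompt") {
    return null; // setup and interrupt carry no caller speech to forward
  }
  // The server adds nothing of its own: caller text in, client text out.
  const reply = askClient(msg.voicePrompt);
  return { type: "text", token: reply, last: true };
}

// Usage: an echo "client" standing in for the real AI client.
const echoClient: AskClient = (text) => `You said: ${text}`;
const relayed = handleRelayMessage(
  JSON.stringify({ type: "prompt", voicePrompt: "hello", last: true }),
  echoClient,
);
console.log(relayed);
```

Keeping the handler free of any response generation mirrors the "telephone" role described above: everything the caller hears comes from whatever `askClient` returns.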

Three Response Paths

Path | When | What Happens
Client Sampling | Client connected via SSE | Caller's speech goes to client via MCP
Answering Machine | Client not connected, Anthropic key set | Built-in Claude fallback collects message
Hard-coded Fallback | Client not connected, no key | Plays "unavailable" message
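
The three rows above reduce to a two-input decision. The sketch below is illustrative only; `selectResponsePath` and the path labels are hypothetical names, not identifiers from the server.

```typescript
type ResponsePath = "client-sampling" | "answering-machine" | "hard-coded-fallback";

// Pick a response path from the two conditions named in the table:
// whether the AI client is connected, and whether an Anthropic key is set.
function selectResponsePath(clientConnected: boolean, anthropicKey?: string): ResponsePath {
  if (clientConnected) {
    return "client-sampling";
  }
  // An empty string counts as "no key", same as an unset variable.
  return anthropicKey ? "answering-machine" : "hard-coded-fallback";
}

console.log(selectResponsePath(false, process.env["ANTHROPIC_API_KEY"]));
```

Note that a connected client always wins: the Anthropic key only matters once the client is unreachable.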

Making Outbound Calls

{
  "agentId": "my-client",
  "to": "+15559876543",
  "greeting": "Hi, this is your AI assistant calling about your appointment.",
  "systemPrompt": "You are a friendly appointment reminder assistant."
}

Once answered, a live two-way conversation begins using the same ConversationRelay flow.
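
Before placing a call, a client may want to check the parameters locally. The following is a rough sketch under assumptions: the field names come from the example above, but which fields are optional is a guess, and `validateOutboundCall` is a hypothetical helper rather than part of the server.

```typescript
// Field names taken from the example payload; optionality is assumed.
interface OutboundCallParams {
  agentId: string;
  to: string;
  greeting?: string;
  systemPrompt?: string;
}

// E.164: a plus sign, a nonzero leading digit, at most 15 digits in total.
const E164 = /^\+[1-9]\d{1,14}$/;

// Return a list of human-readable problems; an empty list means "looks fine".
function validateOutboundCall(params: OutboundCallParams): string[] {
  const problems: string[] = [];
  if (params.agentId.trim() === "") {
    problems.push("agentId is empty");
  }
  if (!E164.test(params.to)) {
    problems.push(`to is not an E.164 number: ${params.to}`);
  }
  return problems;
}

console.log(validateOutboundCall({ agentId: "my-client", to: "+15559876543" }));
```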


Answering Machine

When the AI client is not connected (after an 8-second timeout), the built-in answering machine:

  1. Apologizes to the caller on behalf of the client
  2. Asks for a message and any preferences (e.g., "call me back after 8 AM")
  3. Stores everything in the dead letter queue

When the client reconnects, dead letters are automatically dispatched via comms_get_waiting_messages.
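
The store-then-dispatch behavior can be pictured with a small in-memory queue. This is only a sketch of the idea: the `DeadLetter` fields and the `DeadLetterQueue` class are assumptions for illustration, and the real server may persist messages differently.

```typescript
// Hypothetical shape of one stored message.
interface DeadLetter {
  agentId: string;
  from: string;
  message: string;
  preferences?: string; // e.g. "call me back after 8 AM"
}

class DeadLetterQueue {
  private letters = new Map<string, DeadLetter[]>();

  // Called by the answering machine after it has collected a message.
  store(letter: DeadLetter): void {
    const pending = this.letters.get(letter.agentId) ?? [];
    pending.push(letter);
    this.letters.set(letter.agentId, pending);
  }

  // Called when the client reconnects: return and clear what is waiting.
  drain(agentId: string): DeadLetter[] {
    const pending = this.letters.get(agentId) ?? [];
    this.letters.delete(agentId);
    return pending;
  }
}

const queue = new DeadLetterQueue();
queue.store({ agentId: "my-client", from: "+15559876543", message: "Please call back." });
console.log(queue.drain("my-client").length);
```

Draining clears the queue for that agent, so a second reconnect does not deliver the same messages twice.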

ANTHROPIC_API_KEY=sk-ant-...    # Required for answering machine

Without the key, callers hear a hard-coded "unavailable" message.


Voice Messages (TTS)

Pre-recorded messages (not live conversations):

{
  "agentId": "my-client",
  "to": "+15559876543",
  "text": "Reminder: your appointment is tomorrow at 3 PM."
}

The server generates TTS audio from the text and delivers it as a phone call.


Call Transfer

{
  "agentId": "my-client",
  "callSid": "CAxxxxxxxx",
  "to": "+15551234567",
  "announcementText": "Connecting you to a human client now."
}

Voice Configuration

DEFAULT_VOICE_GREETING="Hello, how can I help you today?"
DEFAULT_VOICE_ID=EXAVITQu4vr4xnSDxMaL
DEFAULT_VOICE_LANGUAGE=en-US

All settings can be overridden per call via tool parameters.
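
One way to picture the override rule is a resolver that prefers per-call values and falls back to the environment defaults. This is a sketch under assumptions: the environment variable names are the ones listed above, while `resolveVoiceSettings` and the override field names `voiceId` and `language` are hypothetical (only `greeting` appears as a tool parameter in this document).

```typescript
interface VoiceSettings {
  greeting: string;
  voiceId: string;
  language: string;
}

// Per-call overrides win; otherwise use the DEFAULT_VOICE_* environment values.
function resolveVoiceSettings(
  env: Record<string, string | undefined>,
  overrides: Partial<VoiceSettings> = {},
): VoiceSettings {
  return {
    greeting: overrides.greeting ?? env["DEFAULT_VOICE_GREETING"] ?? "",
    voiceId: overrides.voiceId ?? env["DEFAULT_VOICE_ID"] ?? "",
    language: overrides.language ?? env["DEFAULT_VOICE_LANGUAGE"] ?? "",
  };
}

// Usage: override only the greeting for one call, keep the other defaults.
const settings = resolveVoiceSettings(process.env, {
  greeting: "Hi, this is your AI assistant calling about your appointment.",
});
console.log(settings);
```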

