Orchestrate AI Agent Communication

Phone calls, SMS, email, WhatsApp — provisioned, routed, billed, and compliant. One server, any provider, any agent.

Anyone Can Make an API Call

But production AI communication needs more than a wrapper.

Vendor Lock-in

Hard-coded to one provider. Switching means rewriting your entire integration from scratch.

No Fallback

Agent disconnects mid-call and the caller hears silence. No voicemail, no transfer, no recovery.

No Compliance

TCPA time-of-day rules, DNC lists, CAN-SPAM, GDPR consent — gaps that become lawsuits.

No Billing

No per-agent cost tracking, no spending caps, no way to monetize when you deploy for agents.

No Security

Unsigned webhooks, no rate limiting, no replay prevention. Open doors for abuse.

Single Language

Caller speaks Spanish, your agent only works in English. No translation, no reach.

Infrastructure, Not Intelligence

Your AI agent is the brain. Butt-Dial is the telephone system.

🤖 AI Agent: your LLM decides what to say
📞 Butt-Dial Server: MCP communication layer
🔌 Providers: Twilio, Vonage, Resend…
👤 Human: calls, texts, emails

Your agent decides what to say

The server never generates AI responses. It handles transport, compliance, and delivery. Your agent stays in control.

Swap providers at config time

Twilio, Vonage, Resend, ElevenLabs, OpenAI TTS — all pluggable. Switch in config, not in code.

Provision in seconds

One API call, under 10 seconds. Phone number, SMS, email, WhatsApp — all channels ready.

What Makes It Different

Features that take months to build. Included.

🎤

5 Voice Services

Complete voice infrastructure. From one-way recordings to live AI conversations and conference calls.

  • Voice Message — one-way audio via phone, WhatsApp, LINE, Telegram, Viber
  • Secretary — built-in LLM answers calls when agent is offline, reports context back
  • Agent (Live) — agent controls the call in real-time via STT/TTS
  • Bridge — connect parties: human+human, human+agent, conference calls
  • Impersonator — LLM mimics the agent's voice & personality, relays text live
🌐

Real-Time Translation

Per-agent language settings. Caller speaks one language, agent works in another. Translated in both directions.

  • Works on voice, SMS, WhatsApp, and email
  • Available for human-to-human communication
  • Set per agent, not per account
  • No extra API — built into the pipeline

Voice Tool Use

Your AI agent takes real actions during a live phone call. Not after — during.

  • Send an SMS while on a call
  • Fire off a confirmation email
  • Transfer to a human when needed
  • Trigger a webhook or any MCP tool
🔀

Zero Vendor Lock-in

Pluggable provider architecture. Swap telephony, email, TTS, or STT providers without touching application code.

  • Twilio ↔ Vonage for calls/SMS
  • Resend ↔ SendGrid for email
  • ElevenLabs ↔ OpenAI for TTS
  • Change in config, deploy, done
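
A minimal sketch of what config-driven provider selection can look like, assuming a registry that maps a config value to an adapter. The `SmsProvider` interface, the `FakeTwilio` and `FakeVonage` stand-ins, and the `sms_provider` config key are all hypothetical and are not taken from the Butt-Dial codebase; real adapters would wrap each vendor's SDK.

```python
from typing import Callable, Dict, Protocol


class SmsProvider(Protocol):
    """Hypothetical interface that every SMS adapter implements."""

    def send_sms(self, to: str, body: str) -> str: ...


class FakeTwilio:
    """Stand-in adapter; a real one would call the vendor SDK."""

    def send_sms(self, to: str, body: str) -> str:
        return f"twilio:{to}:{len(body)}"


class FakeVonage:
    """Second stand-in adapter with the same interface."""

    def send_sms(self, to: str, body: str) -> str:
        return f"vonage:{to}:{len(body)}"


# The registry maps a config value to an adapter factory.
SMS_PROVIDERS: Dict[str, Callable[[], SmsProvider]] = {
    "twilio": FakeTwilio,
    "vonage": FakeVonage,
}


def load_sms_provider(config: dict) -> SmsProvider:
    """Pick the adapter named in config; fail loudly on unknown names."""
    name = config["sms_provider"]
    if name not in SMS_PROVIDERS:
        raise ValueError(f"unknown SMS provider: {name!r}")
    return SMS_PROVIDERS[name]()


def notify(provider: SmsProvider, to: str) -> str:
    # Application code depends only on the interface, never on a vendor.
    return provider.send_sms(to, "Your appointment is confirmed.")
```

In this sketch, changing `{"sms_provider": "twilio"}` to `{"sms_provider": "vonage"}` is the whole migration; `notify` is untouched.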
🔒

Privacy First

Self-hosted by design. Message content passes through — never stored. Credentials encrypted at rest. Logs redacted automatically.

  • Self-hosted — data never leaves your server
  • Message content passes through, never stored
  • Credentials encrypted with AES-256-GCM
  • Phone numbers and emails redacted in logs
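
One way to approximate automatic log redaction is a pair of regular expressions applied to each line before it is written. The patterns and placeholder tokens below are illustrative assumptions, not the project's actual rules; the phone pattern is deliberately broad and will over-redact other long digit runs.

```python
import re

# Simplified patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(line: str) -> str:
    """Replace emails first, then phone-like digit runs, with placeholders."""
    line = EMAIL_RE.sub("[email]", line)
    return PHONE_RE.sub("[phone]", line)
```

Redacting emails before phone numbers keeps digits inside an address from being matched as a partial phone number.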
💰

Built-in Billing Engine

Per-agent cost tracking with tiered plans. Deploy for agents and monetize from day one.

  • 4 tiers: Free, Starter, Pro, Enterprise
  • Offshore communication at local prices
  • Per-agent and per-team spending caps
  • Usage dashboards with cost breakdown
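
Per-agent and per-team caps can be sketched as a ledger that checks both budgets before recording a charge. The `Budget`, `Ledger`, and `CapExceeded` names are hypothetical, and the amounts are placeholders rather than real tier prices.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict


class CapExceeded(Exception):
    """Raised when a charge would push an agent or team past its cap."""


@dataclass
class Budget:
    cap: Decimal
    spent: Decimal = Decimal("0")

    def can_afford(self, cost: Decimal) -> bool:
        return self.spent + cost <= self.cap


@dataclass
class Ledger:
    agents: Dict[str, Budget]
    teams: Dict[str, Budget]
    team_of: Dict[str, str]  # agent id -> team id

    def charge(self, agent_id: str, cost: Decimal) -> None:
        agent = self.agents[agent_id]
        team_id = self.team_of[agent_id]
        team = self.teams[team_id]
        # Check both caps before recording anything, so a rejected
        # charge leaves no partial spend behind.
        if not agent.can_afford(cost):
            raise CapExceeded(f"agent {agent_id} cap reached")
        if not team.can_afford(cost):
            raise CapExceeded(f"team {team_id} cap reached")
        agent.spent += cost
        team.spent += cost
```

`Decimal` is used instead of floats so that repeated small charges do not accumulate rounding error.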
🎤

2-Way Voice Messages

Send and receive WhatsApp voice notes. Text in, voice out. Voice in, text out. Automatic transcription with CC subtitles.

  • Agent sends text → TTS → OGG/Opus voice note
  • Human sends voice → STT transcription → agent gets text
  • CC subtitles: agent always sees readable text
  • Works with any TTS (Edge, ElevenLabs, OpenAI) and STT (Deepgram)
🔗

Isolated Sessions

Every conversation is an isolated tunnel. Thousands of concurrent sessions per agent. Messages never leak between sessions.

  • Auto-created on first inbound message
  • Full context delivered: participants, history, metadata
  • Multi-party: 3+ participants, observer roles
  • Durable — survives server restarts
🔐

Cross-Channel Identity (OTP)

Verify who you're talking to. Send OTP on one channel, verify on another. The cross-channel identity bridge.

  • Unknown voice caller → OTP via SMS → verified
  • New WhatsApp number → OTP to known email → linked
  • Trust sessions with configurable duration (30min–7 days)
  • DTMF support: enter OTP via phone keypad
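
The cross-channel flow above can be sketched as: issue a one-time code for an unverified identity, deliver it over a channel you already trust, and open a time-limited trust session once the code comes back. `OtpBridge`, its method names, and the 5-minute code lifetime are assumptions for illustration; only the 30-minute to 7-day trust window comes from the list above.

```python
import hashlib
import hmac
import secrets
import time

MIN_TRUST = 30 * 60          # 30 minutes, per the range stated above
MAX_TRUST = 7 * 24 * 3600    # 7 days


def _hash(code: str) -> bytes:
    return hashlib.sha256(code.encode()).digest()


class OtpBridge:
    """Hypothetical in-memory OTP issuer with trust sessions."""

    def __init__(self, otp_ttl=300, trust_ttl=MIN_TRUST, clock=time.time):
        if not MIN_TRUST <= trust_ttl <= MAX_TRUST:
            raise ValueError("trust_ttl must be between 30 minutes and 7 days")
        self._otp_ttl = otp_ttl
        self._trust_ttl = trust_ttl
        self._clock = clock
        self._pending = {}  # identity -> (code hash, expires at)
        self._trusted = {}  # identity -> trusted until

    def issue(self, identity: str) -> str:
        """Create a 6-digit code; the caller sends it over another channel."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[identity] = (_hash(code), self._clock() + self._otp_ttl)
        return code

    def verify(self, identity: str, code: str) -> bool:
        """Check a code once; any attempt consumes the pending code."""
        entry = self._pending.pop(identity, None)
        if entry is None:
            return False
        digest, expires_at = entry
        if self._clock() > expires_at:
            return False
        if not hmac.compare_digest(digest, _hash(code)):
            return False
        self._trusted[identity] = self._clock() + self._trust_ttl
        return True

    def is_trusted(self, identity: str) -> bool:
        return self._clock() < self._trusted.get(identity, 0)
```

A code typed on the phone keypad (DTMF) would reach `verify` as the same digit string. Consuming the code on the first attempt, right or wrong, is a strict design choice in this sketch; a production system might allow a small number of retries instead.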
🤖

Agent-to-Agent Messaging

Agents find and message each other directly. Build multi-agent workflows where AI agents collaborate without human intermediaries.

  • Agent directory: search by name or list all
  • Direct messaging via live-delivery (channel: A2A)
  • Same session isolation and context delivery
  • Works across teams with scoped access
💬

WhatsApp Rich Actions

Beyond text: location pins, contact cards, polls, interactive buttons, typing indicators, read receipts, reactions, and more.

  • 10 actions: location, contact, typing, read, react, edit, delete, forward, poll, buttons
  • Self-hosted WhatsApp via WAHA (free, unlimited)
  • GreenAPI and Twilio also supported
  • Docker one-command deploy with WAHA

Production Security. Zero Dependencies.

No helmet. No cors package. Every security layer built from scratch.

🔒 Security

  • Bearer token authentication on all admin routes
  • Webhook signature verification (Twilio, Vonage)
  • Replay attack prevention with nonce cache
  • AES-256-GCM encryption for sensitive data
  • Brute-force lockout (10 failures → 15-min ban)
  • Anomaly detection running every 60 seconds
  • Per-IP and per-route rate limiting
  • Input sanitization on all endpoints
  • IP-based admin access filtering
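
Signature verification and replay prevention usually work together: the signature proves who sent the request, and a timestamp plus nonce proves it has not been seen before. The sketch below uses a generic HMAC-SHA256 scheme chosen for illustration. It is not the Twilio or Vonage signing format, each of which defines its own, and the `WebhookVerifier` class and its 5-minute window are assumptions.

```python
import hashlib
import hmac
import time


class WebhookVerifier:
    """Generic HMAC check with a nonce cache (illustrative, not vendor-specific)."""

    def __init__(self, secret: bytes, max_age: int = 300, clock=time.time):
        self._secret = secret
        self._max_age = max_age
        self._clock = clock
        self._seen = {}  # nonce -> time after which the request is stale anyway

    def sign(self, body: bytes, nonce: str, timestamp: int) -> str:
        message = f"{timestamp}.{nonce}.".encode() + body
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()

    def verify(self, body: bytes, nonce: str, timestamp: int, signature: str) -> bool:
        now = self._clock()
        # Drop nonces whose requests could no longer pass the freshness check.
        self._seen = {n: exp for n, exp in self._seen.items() if exp >= now}
        if abs(now - timestamp) > self._max_age:
            return False  # too old or too far in the future
        expected = self.sign(body, nonce, timestamp)
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return False  # body, nonce, or timestamp was altered
        if nonce in self._seen:
            return False  # same request replayed
        self._seen[nonce] = timestamp + self._max_age
        return True
```

Because the nonce and timestamp are part of the signed message, an attacker cannot swap in a fresh nonce to slip a captured request past the cache.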

📜 Compliance

  • TCPA time-of-day calling restrictions
  • Do Not Call (DNC) list enforcement
  • GDPR consent tracking and erasure
  • CAN-SPAM compliant email handling
  • Content filtering and guardrails
  • Recording consent management
  • SHA-256 tamper-evident audit trail
  • Per-agent compliance configuration
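
A common way to build a SHA-256 audit trail is a hash chain: each entry stores the hash of the previous one, so editing or deleting any past entry breaks every hash after it. This is a sketch of that general technique under assumed field names (`event`, `prev`, `hash`); the source does not describe how Butt-Dial's trail is actually implemented. Strictly, a chain like this makes tampering detectable rather than impossible.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append(log: list, event: dict) -> None:
    """Add an event, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": _digest(prev, event)})


def verify(log: list) -> bool:
    """Recompute the chain from the start; any edit breaks a link."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

Truncating the end of the log still verifies, so a real deployment would also need to anchor the latest hash somewhere outside the log.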

Up and Running in Minutes

Three steps from zero to a fully connected AI agent.

1

Connect

Create an account and get your security token. Link your AI agent via MCP in under a minute.

2

Configure

Add your Twilio, Vonage, or Resend credentials in the admin panel. Test with one click.

3

Communicate

Your AI agent can now call, text, email, and WhatsApp — across 5 channels, 16 languages.

Built to Operate

Everything you need to monitor, debug, and run in production.

📊

Admin Dashboard

Real-time view of agents, calls, messages, and system health.

📈

Prometheus Metrics

Export to Grafana, Datadog, or any metrics backend.

📋

Structured Logging

JSON logs with correlation IDs across every request.
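
As a rough illustration of JSON logs with correlation IDs, the sketch below stores a per-request ID in a context variable and has a formatter stamp it on every line. The field names and the `handle_request` wrapper are assumptions, not the server's actual log schema.

```python
import contextvars
import json
import logging
import uuid

# Holds the current request's ID; "-" means "outside any request".
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


def build_logger(name: str, handler: logging.Handler) -> logging.Logger:
    """Attach the JSON formatter to the given handler and return a logger."""
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, work) -> None:
    """Run `work` with a fresh correlation ID shared by all its log lines."""
    token = correlation_id.set(uuid.uuid4().hex)
    try:
        logger.info("request started")
        work()
        logger.info("request finished")
    finally:
        correlation_id.reset(token)
```

Passing `logging.StreamHandler()` to `build_logger` writes the JSON lines to stderr. Filtering on one `correlation_id` value then reconstructs a single request's path through the logs.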

🔄

Demo Mode

Test everything without live API calls. Safe for development.

Why Butt-Dial

The missing infrastructure layer between AI agents and the real world.

For Investors

  • Infrastructure play — like Twilio was for developers, Butt-Dial is for AI agents
  • Revenue built-in — per-channel markup on every message, call, and email
  • Multi-tenant SaaS — org isolation, per-agent billing, tier system
  • 11 provider adapters — no vendor lock-in, maximum negotiating leverage
  • 49 MCP tools — deepest integration surface for AI agents
  • 5 voice modes — from zero-config to sub-1s Gemini Live

For Developers

  • One MCP server — call, text, email, WhatsApp from any AI agent
  • Zero glue code — provision channels, send messages, make calls via tools
  • Pluggable everything — swap Twilio for Vonage, ElevenLabs for Deepgram, in config
  • Production-ready — auth, rate limiting, compliance, audit logging included
  • Self-hosted — your server, your data, your rules
  • 16 languages — auto-detected, auto-routed to the right voice

49 MCP Tools · 5 Channels · 11 Provider Adapters · 16 Languages

Voice AI — Multiple Engines, One Interface

Choose the right voice pipeline for your use case. Switch anytime from the admin panel.

Gemini Live

Single model. STT+LLM+TTS in one WebSocket. Sub-1s latency. 70+ languages.

🎙

ElevenLabs Agents

Premium voice quality. Custom LLM brain. 5000+ voices. Managed platform.

🔨

Build Your Own

Mix any STT + LLM + TTS. Deepgram, Cartesia, OpenAI — your choice, your control.

Self-Hosted. Open Source. Your Data.

Deploy anywhere. No usage fees, no vendor dashboards, no data leaving your network.