Real-time speech-to-speech
Ultra-low-latency conversations with sub-second round trips. The pause between question and answer is shorter than a human "um".
< 800ms RTT
● OPEN SOURCE / SELF-HOSTABLE / ASTERISK-NATIVE
Agent Voice Response turns any Asterisk PBX into a real-time, interruptible, sub-second voice AI agent. Your providers, your servers, your rules. Free forever.
Built as composable microservices: a core that speaks Asterisk's AudioSocket natively, with hot-swappable ASR, LLM and TTS engines. Run it all on your own metal with Ollama and Vosk, or wire in OpenAI Realtime — same dialplan, your call.
No license keys. No per-minute tax. No vendor gravity. AVR is free for personal and commercial use, developed in the open, and already answering calls for thousands of developers and businesses.
Callers can interrupt mid-sentence and the agent yields instantly: barge-in handling that makes the conversation feel human, not IVR.
BARGE-IN NATIVE
AI-powered noise cancellation and echo suppression scrub the line before the model ever hears it. Call centers sound like studios.
DENOISE + ECHO
First-class AudioSocket support for Asterisk 18+. Drops into FreePBX, VitalPBX, Vicidial, Elastix and custom dialplans without adapters.
AUDIOSOCKET / 18+
Every stage of the pipeline is hot-swappable. Mix cloud horsepower with local privacy, per agent, per call, per line of dialplan.
Answer every call on the first ring, at 3 a.m., in any language, without a queue.
Deflect tier-one volume on Vicidial and FreePBX floors while agents take the calls that need a human.
Appointment booking and triage lines that run on your own servers — where patient audio stays.
Take orders, check stock, and upsell over the phone with an agent that never misquotes a price.
Language drills and training hotlines with a tutor that listens, corrects, and never gets tired.
A phone number as a control plane — call your building, your fleet, your lab, and tell it what to do.
AVR is provided free of charge for personal and commercial use. The code is on GitHub, the images are on Docker Hub, and the roadmap is argued about in public on Discord.
If it saves your team money, you can buy the maintainers a coffee — donations are voluntary and buy you exactly nothing extra. That's the point.
Audio streams from Asterisk into AVR Core over AudioSocket. Voice activity detection segments speech as it happens, ASR transcribes incrementally, the LLM streams its reply, and TTS speaks it back — all stages pipelined so the total round trip stays under a second. With realtime providers like OpenAI Realtime or Ultravox, ASR/LLM/TTS collapse into a single speech-native model.
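To see why pipelining keeps the round trip short, here is a minimal Python sketch. It is not AVR's code: the generic `stage` function, the queue layout, and the artificial delay are all assumptions made for illustration. Each stage starts work on an item the moment it arrives and hands the result on immediately, so the first piece of audio can come out while later speech is still being recognized.

```python
import asyncio

END = None  # sentinel marking the end of a stream


async def stage(name, inbox, outbox, delay):
    """Hypothetical stage: process each item as soon as it arrives, then pass it on."""
    while True:
        item = await inbox.get()
        if item is END:
            await outbox.put(END)  # propagate end-of-stream downstream
            return
        await asyncio.sleep(delay)  # stand-in for real ASR/LLM/TTS work
        await outbox.put(f"{name}({item})")


async def run_pipeline(chunks, delay=0.01):
    """Run three stages concurrently, connected by queues."""
    q_in, q_asr, q_llm, q_tts = (asyncio.Queue() for _ in range(4))
    tasks = [
        asyncio.create_task(stage("asr", q_in, q_asr, delay)),
        asyncio.create_task(stage("llm", q_asr, q_llm, delay)),
        asyncio.create_task(stage("tts", q_llm, q_tts, delay)),
    ]
    for chunk in chunks:
        await q_in.put(chunk)
    await q_in.put(END)

    results = []
    while True:
        item = await q_tts.get()
        if item is END:
            break
        results.append(item)
    await asyncio.gather(*tasks)
    return results


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["chunk1", "chunk2"])))
```

Because the stages overlap, total time grows roughly with the number of chunks plus the pipeline depth rather than with their product. A real deployment would replace the sleep with calls to whichever ASR, LLM and TTS providers are configured.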
AVR works with any Asterisk-based PBX. It speaks AudioSocket, which is native to Asterisk 18+, so anything built on Asterisk (FreePBX, VitalPBX, Vicidial, Elastix, or a hand-rolled dialplan) can route calls to an AVR agent with a few lines of configuration.
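For a sense of what travels over an AudioSocket connection, the sketch below parses its framing in Python. It assumes the publicly described wire format (a one-byte message kind, a two-byte big-endian payload length, then the payload) and the kind values listed in the comments. Check both against the Asterisk documentation for your version. `parse_frames` is a hypothetical helper, not part of AVR.

```python
import struct

# Message kinds as described for the AudioSocket protocol (assumed here;
# confirm against the Asterisk documentation before relying on them).
KIND_HANGUP = 0x00
KIND_UUID = 0x01
KIND_AUDIO = 0x10  # signed 16-bit PCM audio payload
KIND_ERROR = 0xFF

HEADER_SIZE = 3  # 1 byte kind + 2 bytes big-endian payload length


def parse_frames(buffer: bytes):
    """Split a byte buffer into (kind, payload) frames.

    Returns the complete frames plus any trailing bytes that do not yet form
    a whole frame, so the caller can prepend them to the next socket read.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER_SIZE:
        kind, length = struct.unpack_from(">BH", buffer, offset)
        end = offset + HEADER_SIZE + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((kind, buffer[offset + HEADER_SIZE:end]))
        offset = end
    return frames, buffer[offset:]
```

Returning the unparsed remainder matters because TCP reads do not respect frame boundaries: a read can end partway through a header or payload.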
AVR can run completely on your own infrastructure. Pair Vosk or Silero for recognition, Ollama for reasoning, and CoquiTTS or Kokoro for synthesis. Caller audio never leaves your network, which is why healthcare and finance teams pick this configuration.
AVR's voice activity detection runs continuously, even while the agent is speaking. The instant a caller starts talking, playback stops and the new utterance takes priority — the barge-in behaviour you'd expect from a human operator.
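The barge-in behaviour described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AVR's implementation: `BargeInController` is a hypothetical class, and it uses a simple average-energy threshold as a stand-in for a real voice activity detector.

```python
class BargeInController:
    """Tracks agent playback and cancels it when the caller starts speaking."""

    def __init__(self, speech_threshold=500):
        # Assumed threshold on mean absolute sample value (16-bit PCM).
        self.speech_threshold = speech_threshold
        self.playing = False
        self.events = []

    def start_playback(self):
        """Mark the agent as speaking."""
        self.playing = True
        self.events.append("playback_started")

    def on_caller_frame(self, samples):
        """Feed one frame of caller PCM samples; return True if barge-in fired."""
        if not samples:
            return False
        energy = sum(abs(s) for s in samples) / len(samples)
        if energy >= self.speech_threshold and self.playing:
            # Caller spoke over the agent: stop playback immediately.
            self.playing = False
            self.events.append("playback_stopped")
            return True
        return False
```

The key point is that caller frames are inspected on every tick, including while `playing` is true. A production detector would also need to tolerate line noise and echo of the agent's own voice, which a bare energy threshold does not.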
AVR is a set of stateless microservices — core, ASR, LLM, TTS — shipped as Docker images. Scale each stage horizontally behind your load balancer; the architecture is the same one running high-volume contact-center floors today.
Pull the images from Docker Hub, point the compose file at your providers, and add an extension to your dialplan. The documentation walks through a working agent in minutes, and the Discord is where everyone compares configs.