
Vocode

Open-source framework for building real-time voice LLM agents

Framework · Assistant · Deprecated

Last reviewed 2026-06-19

Vocode is an open-source (MIT) Python framework for building real-time voice conversational AI agents. It orchestrates the speech-to-text, LLM, and text-to-speech loop over streaming audio, handling the hard real-time problems such as latency, endpointing, and interruptions (barge-in), and lets developers swap providers for each stage. It supports telephony (Twilio, Vonage), web, and Zoom dial-in, and a few lines of code can stand up a voice agent. Founded in 2023 in San Francisco by Ajay Raj and Kian Hooshmand (Y Combinator W23), Vocode also offered a hosted commercial API. As of this review the project is effectively unmaintained: the last commit was in November 2024 and the last release in mid-2024, and the marketing site redirects to GitHub, consistent with the hosted product being wound down. It is documented here as deprecated; teams building new voice agents typically use actively maintained alternatives.

What it can do

  • Orchestrate real-time voice conversations

    Assistant

    Wires speech-to-text, an LLM, and text-to-speech over streaming audio, handling latency and conversation state as a developer building block.

  • Place and receive phone calls

    Supervised

    Connects to telephony providers (Twilio, Vonage) and Zoom to run voice agents over the phone once configured.

  • Swap STT, TTS, and LLM providers

    Assistant

    Provider-agnostic configuration lets developers mix Deepgram, ElevenLabs, Cartesia, OpenAI, Anthropic, and others.


Strengths

  • Genuinely modular and provider-agnostic across a broad STT/TTS/LLM menu
  • MIT-licensed and fully self-hostable with no lock-in
  • Solves the hard real-time voice problems (latency, endpointing, interruptions)

Limitations

  • Effectively unmaintained: no commits since November 2024, making it production-risky
  • The hosted product appears wound down and the marketing site redirects to GitHub
  • Low-level and Python-centric; you operate an unmaintained codebase yourself

Overview

Vocode is an open-source (MIT) Python framework for building real-time voice conversational AI agents. It is documented here as deprecated: the project is effectively unmaintained.

What it does

Vocode orchestrates the speech-to-text, LLM, and text-to-speech loop over streaming audio, handling latency, endpointing, and interruptions (barge-in), and lets developers swap providers for each stage. It supports telephony (Twilio, Vonage), web, and Zoom dial-in, and a basic agent takes only a few lines of code to set up.
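To make the interruption (barge-in) idea concrete, here is a minimal sketch using only Python's standard `asyncio`. It is not Vocode's API: `speak`, `run_turn`, and the `interrupt_after` parameter are hypothetical names invented for illustration, and the sleep calls stand in for real audio playback and real speech detection. The point is only the pattern: agent playback runs as a cancellable task, and detected user speech cancels it.

```python
from __future__ import annotations

import asyncio
from typing import Optional


async def speak(chunks: list[str], played: list[str]) -> None:
    """Hypothetical stand-in for TTS playback: 'plays' one chunk at a time."""
    for chunk in chunks:
        played.append(chunk)
        await asyncio.sleep(0.01)  # simulated playback time per chunk


async def run_turn(
    chunks: list[str], interrupt_after: Optional[float]
) -> tuple[list[str], bool]:
    """Play the agent's reply; cancel playback if the user barges in.

    interrupt_after simulates the moment (in seconds) at which user speech
    is detected. None means the user stays silent for the whole reply.
    """
    played: list[str] = []
    playback = asyncio.create_task(speak(chunks, played))
    interrupted = False

    if interrupt_after is not None:
        await asyncio.sleep(interrupt_after)  # simulated wait for user speech
        if not playback.done():
            playback.cancel()  # barge-in: stop talking immediately
            interrupted = True

    try:
        await playback
    except asyncio.CancelledError:
        pass  # expected when playback was cut off
    return played, interrupted


if __name__ == "__main__":
    reply = [f"chunk-{i}" for i in range(100)]
    played, interrupted = asyncio.run(run_turn(reply, interrupt_after=0.03))
    print(f"interrupted={interrupted}, chunks played={len(played)} of {len(reply)}")
```

A real system would replace the simulated delay with a voice-activity or endpointing signal from the transcriber, and would typically also record how much of the reply was actually heard so the LLM's conversation history stays accurate.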

Status

The last commit was in November 2024 and the last release in mid-2024, and the vocode.dev site redirects to GitHub, consistent with the hosted commercial product being wound down. No formal shutdown or acquisition was announced; the repo is not archived but is inactive.

Integrations & setup

Provider-agnostic across STT (Deepgram, AssemblyAI, Whisper), TTS (ElevenLabs, Cartesia, Play.ht), and LLMs (OpenAI, Anthropic), with telephony via Twilio, Vonage, and Zoom. Self-hostable under MIT.
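The provider-agnostic design can be sketched as three small interfaces with a pipeline that depends only on them. This is an illustrative sketch, not Vocode's actual classes: `Transcriber`, `Agent`, `Synthesizer`, `VoicePipeline`, and the stand-in implementations are all hypothetical names, and the stand-ins do no real speech processing.

```python
from dataclasses import dataclass
from typing import Protocol


class Transcriber(Protocol):
    """Speech-to-text stage (hypothetical interface)."""
    def transcribe(self, audio: bytes) -> str: ...


class Agent(Protocol):
    """LLM stage (hypothetical interface)."""
    def respond(self, text: str) -> str: ...


class Synthesizer(Protocol):
    """Text-to-speech stage (hypothetical interface)."""
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Runs one conversational turn through whichever stages were supplied."""
    transcriber: Transcriber
    agent: Agent
    synthesizer: Synthesizer

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcriber.transcribe(audio)
        reply = self.agent.respond(text)
        return self.synthesizer.synthesize(reply)


# Stand-in stages so the sketch runs without any external service.
class FakeTranscriber:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")  # pretend the audio bytes are the words


class EchoAgent:
    def respond(self, text: str) -> str:
        return f"You said: {text}"


class FakeSynthesizer:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")  # pretend the text bytes are audio


pipeline = VoicePipeline(FakeTranscriber(), EchoAgent(), FakeSynthesizer())

if __name__ == "__main__":
    print(pipeline.handle_turn(b"hello"))
```

Because `VoicePipeline` only relies on the three method signatures, swapping one vendor for another means passing a different object for that stage; the other two stages and the pipeline itself are unchanged.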

Best for / not for

Not recommended for new production builds given its unmaintained state. Teams typically use actively maintained voice frameworks and platforms instead.

Traction

Vocode went through Y Combinator (W23) and raised a reported ~$3.25M seed in early 2024 (investors include Base10 and Accel) before going quiet.

Alternatives

Vapi, Retell AI, and Bland AI are actively maintained voice-agent alternatives.


FAQ

Is Vocode still maintained?

No. As of this review the project is effectively unmaintained: the last commit was in November 2024 and the marketing site redirects to GitHub, consistent with the hosted product being wound down. It is documented here as deprecated.

What is Vocode for?

It is an open-source Python framework for building real-time voice agents by orchestrating speech-to-text, an LLM, and text-to-speech over streaming audio, with swappable providers.
