The Voice AI Wiki

A knowledge base for voice AI: speech recognition, synthesis, real-time translation, voice agents, and everything in between.

Foundations

Orientation: what the field is, what the words mean, and how it got here, before you drill into a topic.

The core. Every concept that sits between a microphone and a transcript.

Making machines speak: how voices are built, streamed, and judged.

Translating speech as it happens, before the sentence is even finished.

Closing the loop: software that listens, decides, and speaks back fast enough to hold a conversation.

Everything you can pull out of audio once the words are no longer the point.

Terms used throughout The Voice AI Wiki