Pipecat
Integrate Soniox Speech-to-Text and Text-to-Speech into Pipecat pipelines.

Overview
Pipecat is a framework for building voice-enabled, real-time, multimodal AI applications. A typical Pipecat pipeline for voice applications looks like this:
- Send Audio - Capture streamed audio from the user and transmit it to the pipeline.
- Transcribe Speech - Convert speech to text as the user is talking.
- Process with LLM - Generate responses using a large language model.
- Convert to Speech - Transform text responses into natural speech.
- Play Audio - Stream the audio response back to the user.
Soniox plugs into two stages of this pipeline:
- SonioxSTTService handles the Transcribe Speech step using the Soniox real-time STT API.
- SonioxTTSService handles the Convert to Speech step using the Soniox real-time TTS API.
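To make the stage order concrete, here is a minimal sketch in plain Python that models the five steps as functions chained in sequence. It does not use the Pipecat or Soniox APIs: every function name below is a hypothetical stand-in, and the comments mark the two slots that SonioxSTTService and SonioxTTSService would occupy in a real pipeline.

```python
from typing import Callable, List, Optional

# Each stage is modeled as a plain function from one value to the next.
# These are illustrative stand-ins only, not Pipecat or Soniox APIs.
Stage = Callable[[object], object]


def capture_audio(_: Optional[object]) -> bytes:
    # Send Audio: pretend we received a chunk of audio from the user.
    return b"\x00\x01fake-pcm"


def transcribe(audio: bytes) -> str:
    # Transcribe Speech: the slot SonioxSTTService would fill.
    return "hello there"


def respond(text: str) -> str:
    # Process with LLM: a canned reply instead of a real model call.
    return f"You said: {text}"


def synthesize(text: str) -> bytes:
    # Convert to Speech: the slot SonioxTTSService would fill.
    return text.encode("utf-8")


def play_audio(audio: bytes) -> int:
    # Play Audio: report how many bytes would be streamed back.
    return len(audio)


def run_pipeline(stages: List[Stage], initial: Optional[object] = None) -> object:
    # Feed the output of each stage into the next, in order.
    value = initial
    for stage in stages:
        value = stage(value)
    return value


PIPELINE: List[Stage] = [capture_audio, transcribe, respond, synthesize, play_audio]
```

Calling run_pipeline(PIPELINE) walks the stages in the same order as the list above. Replacing a speech provider then amounts to changing only the transcribe or synthesize slot while the rest stays in place.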
For more details on how Pipecat works, check the Pipecat documentation.
Installation
Install Pipecat with the Soniox extras.
You will also need to set your Soniox API key as an environment variable.
You can obtain a Soniox API key by signing up at the Soniox Console.
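As a small sketch of how a bot might read that key at startup, the helper below looks it up in the environment and fails early with a clear message if it is missing. The variable name SONIOX_API_KEY and the function name load_soniox_api_key are assumptions made for illustration; check the Soniox and Pipecat documentation for the exact name your setup expects.

```python
import os


def load_soniox_api_key(env_var: str = "SONIOX_API_KEY") -> str:
    # The default variable name is an assumption, not confirmed by this page.
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early so a missing key is not discovered mid-conversation.
        raise RuntimeError(
            f"{env_var} is not set; export it before starting the bot."
        )
    return key
```

Failing at startup keeps the error close to its cause, rather than surfacing later as a rejected connection to the speech service.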
Services
Use SonioxSTTService to transcribe user audio in real time, with language hints, context, and speaker diarization.
Use SonioxTTSService to synthesize natural speech in 60+ languages over a streaming WebSocket connection.
Guides
Compose Soniox STT and TTS into a complete voice agent.
Replace your current STT or TTS provider with Soniox in an existing Pipecat bot.