LiveKit
How to use Soniox Speech-to-Text AI with LiveKit
Overview
Soniox Speech-to-Text AI turns audio into highly accurate text in real time. Paired with LiveKit, it lets you build powerful, responsive voice agents.
Use Soniox in your LiveKit agents to:
- Transcribe live audio from voice or video sessions in real time
- Build custom voice agents powered by Soniox
- Deploy voice-driven experiences at enterprise scale
All at lightning speed.
Getting started
To use Soniox with LiveKit, you'll need a LiveKit account.
Installation
Soniox provides Speech-to-Text through a WebSocket API, which is integrated into the official LiveKit Python plugin.
Install the LiveKit Agents library from PyPI.
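As a sketch, assuming the Soniox plugin is published as the `soniox` extra of `livekit-agents` (check PyPI for the exact package name), the install command would be along the lines of `pip install "livekit-agents[soniox]"`. The snippet below is one way to confirm afterwards that the plugin can be found. `plugin_available` is a hypothetical helper, and `livekit.plugins.soniox` is the assumed module path.

```python
import importlib.util


def plugin_available(module_name: str = "livekit.plugins.soniox") -> bool:
    """Return True if the module can be found on the import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "livekit") is missing.
        return False


if __name__ == "__main__":
    print("Soniox plugin installed:", plugin_available())
```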
Get a Soniox API key
The Soniox plugin requires an API key to authenticate. You can find your API key in the Soniox Console.
Set the Soniox API key in your .env file.
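Assuming the plugin reads the key from a variable named `SONIOX_API_KEY` (so the .env entry would look like `SONIOX_API_KEY=<your key>`), a small check at startup can surface a missing key early. This is a minimal sketch: `get_soniox_api_key` is a hypothetical helper, and it assumes the .env file has already been loaded into the process environment, for example by a loader such as python-dotenv.

```python
import os
from typing import Mapping, Optional


def get_soniox_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Soniox API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    # SONIOX_API_KEY is the assumed variable name; adjust if your setup differs.
    key = env.get("SONIOX_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SONIOX_API_KEY is not set; add it to your .env file")
    return key
```

Calling this helper before starting the agent turns a missing key into an immediate, readable error rather than a failed connection later on.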
Usage
Use Soniox STT in an AgentSession or as a standalone transcription service.
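A minimal sketch of the AgentSession case, assuming the plugin exposes an `STT` class under `livekit.plugins.soniox` and that `AgentSession` accepts it through its `stt` argument. `build_session` is a hypothetical wrapper. The imports are kept inside the function only so the module still loads where the plugin is not installed; in an agent project they would normally sit at the top of the file.

```python
def build_session():
    """Create an AgentSession that uses Soniox for speech-to-text."""
    # Lazy imports: both packages come from the LiveKit Agents install.
    from livekit.agents import AgentSession
    from livekit.plugins import soniox

    # soniox.STT() is assumed to pick up SONIOX_API_KEY from the environment.
    return AgentSession(stt=soniox.STT())
```

A complete agent would typically pass an LLM and a TTS component to the session as well; they are left out here to keep the focus on the speech-to-text part.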
Congratulations! You are now ready to use Soniox STT in your LiveKit agents.
Advanced usage
Language hints
There's no need to specify a language in advance — the model automatically detects and transcribes any supported language. It also handles multilingual audio effortlessly, even when multiple languages appear within the same sentence or conversation.
If you already know which languages are likely to be spoken, you can provide language hints to help the model prioritize those languages and improve accuracy.
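As an illustration, assuming the plugin accepts hints as a list of language codes (such as "en" or "es") through an `STTOptions` object with a `language_hints` field, the hints could be cleaned up and passed along as sketched below. `normalize_language_hints` and `build_stt` are hypothetical helpers, and the `params=` argument name is an assumption to verify against the plugin reference.

```python
from typing import Iterable, List


def normalize_language_hints(hints: Iterable[str]) -> List[str]:
    """Lower-case, trim, and de-duplicate language codes, keeping first-seen order."""
    seen = set()
    result = []
    for hint in hints:
        code = hint.strip().lower()
        if code and code not in seen:
            seen.add(code)
            result.append(code)
    return result


def build_stt(hints: Iterable[str]):
    """Create a Soniox STT instance that prioritizes the given languages."""
    # Lazy import so the helper above works without the plugin installed.
    from livekit.plugins import soniox

    # STTOptions and its language_hints field are assumed plugin names.
    return soniox.STT(
        params=soniox.STTOptions(language_hints=normalize_language_hints(hints))
    )
```

For example, `build_stt(["en", "es"])` would hint that the audio is mostly English and Spanish, while automatic detection still applies to anything else that is spoken.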
See the list of supported languages in the Soniox documentation.
The Soniox documentation also covers language hints in more detail.
Customization with context
By providing context, you help the model understand and anticipate the language in your audio, even when some terms are not spoken clearly or completely.
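A minimal sketch of one way to supply context, assuming the plugin takes it as a free-text string through a `context` field on `STTOptions`. The "Topic / Key terms" layout produced by `build_context` is purely illustrative, not a format required by Soniox, and both helper names are hypothetical.

```python
from typing import Iterable


def build_context(topic: str, terms: Iterable[str]) -> str:
    """Join a topic and a list of domain terms into one free-text context string."""
    cleaned = [t.strip() for t in terms if t.strip()]
    parts = [f"Topic: {topic.strip()}."]
    if cleaned:
        parts.append("Key terms: " + ", ".join(cleaned) + ".")
    return " ".join(parts)


def build_stt_with_context(context: str):
    """Create a Soniox STT instance that carries the given context."""
    # Lazy import so build_context works without the plugin installed.
    from livekit.plugins import soniox

    # The context field on STTOptions is an assumed plugin name.
    return soniox.STT(params=soniox.STTOptions(context=context))
```

Product names, people's names, and domain jargon that a general model might mis-transcribe are the kind of terms worth listing.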
Learn more about customizing with context in the Soniox documentation.