LiveKit

How to use Soniox Speech-to-Text AI with LiveKit

Overview

Soniox Speech-to-Text AI turns audio into highly accurate text in real time. Paired with LiveKit, you can create powerful, responsive voice agents.

Use Soniox in your LiveKit agents to:

  • Transcribe live audio from voice or video sessions in real time
  • Build custom voice agents powered by Soniox
  • Deploy voice-driven experiences at enterprise scale

All at lightning speed.


Getting started

To use Soniox with LiveKit, you'll need a LiveKit user account.


Installation

Soniox provides Speech-to-Text through a WebSocket API, which is integrated into the official LiveKit Python plugin.

Install the Soniox plugin for LiveKit Agents from PyPI:

pip install livekit-plugins-soniox

Get a Soniox API key

The Soniox plugin requires an API key to authenticate. You can find your API key in the Soniox Console. Set your Soniox API key in your .env file:

SONIOX_API_KEY=<your_soniox_api_key>
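
A missing key is easier to diagnose at startup than in the middle of a session. The following is a minimal sketch of such a check using only the standard library. It assumes the .env file has already been loaded into the process environment (for example by a dotenv loader); the helper name `require_env` is hypothetical and is not part of the Soniox plugin or LiveKit.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of the Soniox or LiveKit APIs.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

Calling `require_env("SONIOX_API_KEY")` before creating the agent session would then fail fast with a readable message if the key is absent.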

Usage

Use Soniox STT in an AgentSession or as a standalone transcription service:

from livekit.agents import AgentSession
from livekit.plugins import soniox

session = AgentSession(
    stt=soniox.STT(),
    # ... llm, tts, etc.
)

Congratulations! You are now ready to use Soniox STT in your LiveKit agents.


Advanced usage

Language hints

You don't need to specify a language in advance: the model automatically detects and transcribes any supported language. It also handles multilingual audio, even when multiple languages appear within the same sentence or conversation.

If you already know which languages are likely to be spoken, you can provide language hints to help the model prioritize those languages and improve accuracy:

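from livekit.agents import AgentSession
from livekit.plugins import soniox

# Hint that the audio is most likely English or Spanish
options = soniox.STTOptions(
    language_hints=["en", "es"],
)

session = AgentSession(
    stt=soniox.STT(params=options),
    # ... llm, tts, etc.
)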

See the list of supported languages.

You can learn more about language hints here.
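
When hints come from user settings or a config file, it can help to tidy them before passing them on. The sketch below assumes hints are lowercase language codes such as "en" and "es", as in the example above; `normalize_language_hints` is a hypothetical helper, not part of the Soniox plugin.

```python
def normalize_language_hints(raw_hints):
    """Lowercase and trim each hint, drop blanks and duplicates, keep first-seen order.

    Hypothetical helper for illustration; assumes hints are language codes like "en".
    """
    seen = set()
    hints = []
    for hint in raw_hints:
        code = hint.strip().lower()
        if code and code not in seen:
            seen.add(code)
            hints.append(code)
    return hints
```

The resulting list can be passed as the `language_hints` value shown earlier.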

Customization with context

By providing context, you help the AI model better understand and anticipate the language in your audio, even if some terms do not appear clearly or completely. For example, a list of medication names helps the model recognize them when they are spoken:

from livekit.agents import AgentSession
from livekit.plugins import soniox

# Domain terms the model should expect to hear
options = soniox.STTOptions(
    context="Celebrex, Zyrtec, Xanax, Prilosec, Amoxicillin Clavulanate Potassium",
)

session = AgentSession(
    stt=soniox.STT(params=options),
    # ... llm, tts, etc.
)

Learn more about customizing with context here.
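
If your domain terms live in a list (for instance, loaded from a product catalog), the context string can be assembled programmatically. This is a minimal sketch that assumes a comma-separated string like the one in the example above; `build_context` is a hypothetical helper, not part of the Soniox plugin.

```python
def build_context(terms):
    """Join domain terms into a single comma-separated context string.

    Collapses stray whitespace inside each term and skips blanks and duplicates.
    Hypothetical helper for illustration; not part of the Soniox or LiveKit APIs.
    """
    cleaned = []
    for term in terms:
        term = " ".join(term.split())  # collapse runs of whitespace
        if term and term not in cleaned:
            cleaned.append(term)
    return ", ".join(cleaned)
```

The returned string can be used as the `context` value shown earlier.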