Customization

Learn how to use custom context to enhance transcription accuracy.

Overview

Soniox Speech-to-Text AI allows you to enhance transcription accuracy by providing custom context for each transcription session. This feature is especially useful when working with:

  • Industry-specific terminology
  • Brand names or product names
  • Uncommon names or made-up words
  • Domain-specific documents or phrases

Providing context helps the model anticipate the language in your audio, even when some terms are not spoken clearly or completely.


How context works

The context parameter accepts any text that may be relevant to the transcription session. This text is not required to appear in the audio — it simply acts as guidance for the model to improve recognition accuracy when necessary.

The model uses the provided context only when helpful, and it does not override normal speech recognition behavior.


Supported context types

You can supply many types of text to the context parameter, such as:

List of terms or keywords

Useful for proper nouns, technical vocabulary, or product names:

{
  "context": "Celebrex, Zyrtec, Xanax, Prilosec, Amoxicillin Clavulanate Potassium"
}
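When the terms already live in a list in your own code, the comma-separated string can be assembled programmatically. The following is a minimal sketch: `build_term_context` is a hypothetical helper, not part of the Soniox API, and only the `context` field name is taken from the example above.

```python
import json


def build_term_context(terms):
    """Join terms into a comma-separated string, skipping blanks and duplicates."""
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        key = term.casefold()  # compare case-insensitively
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)
    return ", ".join(cleaned)


terms = ["Celebrex", "Zyrtec", " Xanax", "zyrtec", "", "Prilosec"]
payload = {"context": build_term_context(terms)}
print(json.dumps(payload, indent=2))
```

Deduplicating before joining keeps the context short, which matters once the list grows toward the size limit.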

Full text or summary

Provide a paragraph, summary, or reference document related to the audio content:

{
  "context": "The customer, Maria Lopez, contacted BrightWay Insurance to update her auto policy after purchasing a new vehicle. Agent Daniel Kim reviewed the changes, explained the premium adjustment, and offered a bundling discount. Maria agreed to update the policy and scheduled a follow-up to consider additional options."
}

Context size limit

  • The context can contain up to 8,000 tokens (roughly 10,000+ characters)
  • This allows you to include substantial information, including summaries, scripts, or glossary-style entries

If the context exceeds the limit, the API returns an error, so trim or summarize the context as needed.
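The limit is counted in tokens, and the tokenizer is not described here, so a client can only estimate the size in advance. The sketch below uses a character budget as a rough proxy, based on the "roughly 10,000 characters" figure above. `MAX_CONTEXT_CHARS` and `trim_context` are hypothetical names, and a context that fits this budget may still exceed the token limit.

```python
MAX_CONTEXT_CHARS = 10_000  # assumption: rough proxy for the 8,000-token limit


def trim_context(context, max_chars=MAX_CONTEXT_CHARS):
    """Return context cut to at most max_chars, preferring a word boundary."""
    if len(context) <= max_chars:
        return context
    truncated = context[:max_chars]
    # If the cut landed inside a word, drop the partial word
    if not context[max_chars].isspace():
        head, sep, _ = truncated.rpartition(" ")
        if sep:
            truncated = head
    # Remove a dangling separator left at the end
    return truncated.rstrip(" ,")


print(trim_context("alpha, beta, gamma", max_chars=9))
```

Truncation is the simplest strategy. For long documents, summarizing or keeping only the most relevant terms is likely to preserve more useful context.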


Best practices

  • Use commas or spacing to separate terms in short lists
  • Keep context relevant to the session — don't overload with unrelated data
  • Preprocess content from related documents (e.g., transcripts, emails, product info) into a clean context block
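As an illustration of the preprocessing step, the sketch below merges raw text from several sources into a single context string by collapsing whitespace and dropping empty or repeated lines. `clean_context_block` is a hypothetical helper and the sample inputs are made up for the example. Real documents may need additional cleanup, such as removing signatures or markup.

```python
import re


def clean_context_block(*documents):
    """Merge raw text snippets into one whitespace-normalized context string."""
    seen = set()
    lines = []
    for doc in documents:
        for line in doc.splitlines():
            # Collapse runs of whitespace and trim the ends
            line = re.sub(r"\s+", " ", line).strip()
            if line and line not in seen:
                seen.add(line)
                lines.append(line)
    return " ".join(lines)


email = "Hi  Maria,\n\n  your BrightWay Insurance policy was updated.\n"
notes = "Agent: Daniel Kim\nyour BrightWay Insurance policy was updated."
print(clean_context_block(email, notes))
```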

Use cases

Use case                     Example context
Medical transcription        Medication names, procedure terms, doctor/patient names.
Call center recordings       Customer name, agent info, company-specific lingo.
Industry-specific jargon     Terms from legal, finance, biotech, or tech domains.
Podcasts / interviews        Guest names, brand mentions, episode summaries.
Custom words and neologisms  Fictional terms, product names, made-up branding.

Example: Custom word recognition

The following example demonstrates how to transcribe audio containing the words Celebrex, Zyrtec, Xanax, Prilosec, and Amoxicillin Clavulanate Potassium by including them in the context:

import os
import time
 
import requests
 
# Retrieve the API key from environment variable (ensure SONIOX_API_KEY is set)
api_key = os.environ["SONIOX_API_KEY"]
api_base = "https://api.soniox.com"
audio_url = "https://soniox.com/media/examples/context_demo.mp3"
 
session = requests.Session()
session.headers["Authorization"] = f"Bearer {api_key}"
 
 
def poll_until_complete(transcription_id):
    """Block until the transcription completes; raise if it fails."""
    while True:
        res = session.get(f"{api_base}/v1/transcriptions/{transcription_id}")
        res.raise_for_status()
        data = res.json()
        if data["status"] == "completed":
            return
        if data["status"] == "error":
            raise RuntimeError(
                f"Transcription failed: {data.get('error_message', 'Unknown error')}"
            )
        # Wait before checking the status again
        time.sleep(1)
 
 
def main():
    print("Starting transcription...")
 
    res = session.post(
        f"{api_base}/v1/transcriptions",
        json={
            "audio_url": audio_url,
            "model": "stt-async-preview",
            "language_hints": ["en", "es"],
            "context": (
                "Celebrex, Zyrtec, Xanax, Prilosec, "
                + "Amoxicillin Clavulanate Potassium"
            ),
        },
    )
    res.raise_for_status()
    transcription_id = res.json()["id"]
    print(f"Transcription ID: {transcription_id}")
 
    # Poll until transcription is done
    poll_until_complete(transcription_id)
 
    # Get the transcript text
    res = session.get(f"{api_base}/v1/transcriptions/{transcription_id}/transcript")
    res.raise_for_status()
    print("Transcript:")
    print(res.json()["text"])
 
 
if __name__ == "__main__":
    main()
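The polling loop in the example runs until the transcription completes or fails. For production use, a bounded variant with a timeout and increasing delay may be preferable. The following is a sketch under assumptions: `poll_with_timeout` and its parameters are hypothetical, and `get_status` stands for any callable that returns the transcription's current `status` string (only the values "completed" and "error" are taken from the example above).

```python
import time


def poll_with_timeout(get_status, timeout=300.0, interval=1.0, max_interval=10.0):
    """Call get_status() until it returns 'completed'; raise on error or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "completed":
            return
        if status == "error":
            raise RuntimeError("Transcription failed")
        # Give up if the next wait would pass the deadline
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"Transcription not finished after {timeout} seconds")
        time.sleep(interval)
        # Back off gradually, up to max_interval seconds between checks
        interval = min(interval * 2, max_interval)
```

In the example above, `get_status` could wrap the existing status request, for instance a function that calls `session.get` on the transcription URL and returns `res.json()["status"]`.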
