Context
Learn how to use custom context to enhance transcription accuracy.
Overview
Soniox Speech-to-Text AI lets you improve both transcription and translation accuracy by providing context with each session.
Context helps the model understand your domain, recognize important terms, and apply custom vocabulary and translation preferences.
Think of it as giving the model your world — what the conversation is about, which words are important, and how certain terms should be translated.
Context sections
You provide context through the `context` object, which can include up to four sections, each improving accuracy in different ways:
Section | Type | Description |
---|---|---|
general | array of JSON objects | Structured key-value information (domain, topic, intent, etc.) |
text | string | Longer free-form background text or related documents |
terms | array of strings | Domain-specific or uncommon words |
translation_terms | array of JSON objects | Custom translations for ambiguous terms |
All sections are optional — include only what's relevant for your use case.
General
General information provides baseline context which guides the AI model. It helps the model adapt its vocabulary to the correct domain, improving transcription and translation quality and clarifying ambiguous words.
It consists of structured key-value pairs describing the conversation domain, topic, intent, and other relevant metadata such as participants' names, organization, setting, and location.
Example
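A minimal sketch of a `general` section, built in Python and serialized to JSON. The section table only states that `general` is an array of JSON objects holding key-value information; the `key` and `value` field names and all sample values below are assumptions for illustration, so confirm the exact shape against the API reference.

```python
import json

# Hypothetical sample metadata. The "key"/"value" field names are an
# assumption; the documentation above only says "array of JSON objects".
general = [
    {"key": "domain", "value": "Healthcare"},
    {"key": "topic", "value": "Diabetes management consultation"},
    {"key": "doctor", "value": "Dr. Martha Smith"},
]

# Wrap the section in a context object and serialize it for a request body.
context = {"general": general}
payload = json.dumps({"context": context}, indent=2)
print(payload)
```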
Text
Provide longer unstructured text that expands on general information — examples include:
- History of prior interactions with a customer.
- Reference documents.
- Background summaries.
- Meeting notes.
Example
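A minimal sketch of the `text` section, which is a single free-form string. The background note below is invented sample data, not taken from any real interaction.

```python
import json

# Hypothetical background text describing prior interactions with a customer.
background = (
    "The customer called last week about a delayed shipment. "
    "A replacement order was issued and is expected to arrive within five days."
)

# The text section is a plain string inside the context object.
context = {"text": background}
payload = json.dumps({"context": context})
print(payload)
```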
Transcription terms
Improve transcription accuracy of important or uncommon words and phrases that you expect in the audio — such as:
- Domain or industry-specific terminology.
- Brand or product names.
- Rare, uncommon, or invented words.
Example
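A minimal sketch of the `terms` section, which is an array of strings. The drug names are sample values chosen for illustration, and removing duplicates before sending is a tidiness choice made here, not a documented requirement.

```python
import json

# Hypothetical list of terms gathered from several sources, with a duplicate.
raw_terms = ["Celebrex", "Zyrtec", "Xanax", "Celebrex"]

# dict.fromkeys keeps the first occurrence of each term and preserves order.
terms = list(dict.fromkeys(raw_terms))

context = {"terms": terms}
payload = json.dumps({"context": context})
print(payload)
```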
Translation terms
Control how specific words or phrases are translated — useful for:
- Technical terminology.
- Entity names.
- Words with ambiguous domain-specific translations.
- Idioms and figurative speech with non-literal meaning.
Example for English → Spanish translation
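A minimal sketch of `translation_terms` for English to Spanish. The section table only says this is an array of JSON objects; the `source` and `target` field names and the sample pairs are assumptions for illustration. The "St John's" entry shows a name mapped to itself so that it is left untranslated.

```python
import json

# Hypothetical English -> Spanish pairs. The "source"/"target" field names
# are an assumption; check the API reference for the exact shape.
translation_terms = [
    {"source": "Mr. Smith", "target": "Sr. Smith"},
    {"source": "St John's", "target": "St John's"},  # keep the name unchanged
    {"source": "stroke", "target": "ictus"},         # medical sense, not a brush stroke
]

context = {"translation_terms": translation_terms}
payload = json.dumps({"context": context}, ensure_ascii=False)
print(payload)
```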
Tips
- Provide domain and topic in the `general` section for best accuracy.
- Keep `general` short, ideally no more than 10 key-value pairs.
- Use `terms` to ensure consistent spelling and casing of difficult entity names.
- Use `translation_terms` to preserve terms like names or brands unchanged, e.g., "St John's" → "St John's".
Context size limit
- Maximum 8,000 tokens (~10,000 characters).
- Supports large blocks of text: glossaries, scripts, domain summaries.
- If you exceed the limit, the API returns an error, so trim or summarize the context first.
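A rough client-side pre-check can catch an oversized context before it is sent. The sketch below compares the serialized length against the approximate 10,000-character figure. The real limit is expressed in tokens, and how the API measures context size is not stated here, so measuring the serialized JSON is an assumption and the helper name is hypothetical.

```python
import json

# Approximate character figure from the limit above; the real limit is in tokens.
MAX_CONTEXT_CHARS = 10_000

def context_within_char_budget(context: dict, max_chars: int = MAX_CONTEXT_CHARS) -> bool:
    """Return True if the serialized context fits the rough character budget."""
    serialized = json.dumps(context, ensure_ascii=False)
    return len(serialized) <= max_chars

small = {"terms": ["Celebrex", "Zyrtec"]}
large = {"text": "x" * 20_000}

print(context_within_char_budget(small))  # True
print(context_within_char_budget(large))  # False
```

Because the character count is only an approximation of the token limit, a context that passes this check can still be rejected by the API.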
Context (deprecated)
Overview
This documentation applies to the deprecated models: `stt-async-preview-v1` and `stt-rt-preview-v2`.
Soniox Speech-to-Text AI lets you boost both transcription and translation accuracy by providing context with each session.
Context is extra text that guides the AI model with domain knowledge, vocabulary, and phrases. It is especially helpful when your audio includes:
- Industry-specific terminology.
- Brand or product names.
- Rare, uncommon, or invented words.
- Domain-specific documents, scripts, or phrases.
With context, you can adapt Soniox instantly to your domain — no training required.
How context works
You provide context through the `context` parameter:
- The text does not need to appear in the audio.
- Context is used only when helpful — it does not override normal recognition or translation.
- Improves accuracy for both transcription and translation.
Examples
Keyword list
Helpful for drug names, product names, or technical vocabulary:
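A minimal sketch of a keyword-list context for the deprecated models. It assumes `context` is a single plain-text string, as suggested by "Context is extra text" above, and that a comma-separated list is an acceptable way to write keywords. Both the separator and the sample drug names are illustrative assumptions.

```python
import json

# Hypothetical keywords expected in the audio.
keywords = ["Celebrex", "Zyrtec", "Xanax", "Prilosec"]

# Assumption: the deprecated models take context as one text string.
context = ", ".join(keywords)

payload = json.dumps({"context": context})
print(payload)
```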
Paragraph or summary
Provide relevant text that reflects the audio content:
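A minimal sketch of a paragraph-style context for the deprecated models, again assuming `context` is a plain-text string. The summary is invented sample data, and collapsing whitespace is a tidiness choice made here rather than a documented requirement.

```python
import json

# Hypothetical summary of what the audio is expected to cover.
summary = """
    The customer is calling about a refund for an order that arrived damaged.
    The agent will confirm the order details and explain the return process.
"""

# Collapse newlines and indentation into single spaces before sending.
context = " ".join(summary.split())

payload = json.dumps({"context": context})
print(payload)
```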