Language restrictions
Learn about supported languages and how to specify language hints.
Overview
Soniox Speech-to-Text AI supports restricting recognition to specific languages. This is useful when your application expects speech in a known language and you want to avoid accidental transcription in other languages, especially in cases of heavy accents or ambiguous pronunciation.
Language restriction is best-effort, not a hard guarantee. The model is strongly biased toward the specified languages, but in rare edge cases it may still output another language. When configured correctly, this happens very infrequently.
How language restrictions work
Language restriction is enabled using two parameters:
language_hints
A list of expected spoken languages, provided as ISO language codes (e.g. en for English, es for Spanish).
language_hints_strict
A boolean flag that enables language restriction based on the provided hints.
When language_hints_strict is set to true, the model will strongly prefer producing output only in the specified languages.
Best results are achieved when specifying a single language.
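To show how the two parameters fit together, here is a minimal Python sketch that builds the language-related part of a request configuration. Only the parameter names language_hints and language_hints_strict come from this page; the helper name build_language_config and the client-side check are illustrative assumptions, not part of the Soniox API.

```python
import json
from typing import Any


def build_language_config(hints: list[str], strict: bool = False) -> dict[str, Any]:
    """Return the language-related fields of a request config (hypothetical helper)."""
    if strict and not hints:
        # Assumption: strict mode restricts output to the provided hints,
        # so it is rejected here when no hints are given.
        raise ValueError("language_hints_strict requires at least one language hint")
    return {
        "language_hints": list(hints),
        "language_hints_strict": strict,
    }


config = build_language_config(["en"], strict=True)
print(json.dumps(config, indent=2))
```

The resulting dictionary would typically be merged into the rest of the request configuration before it is sent; where exactly these fields belong in the request is not specified on this page.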
Recommended usage
✅ Use a single language whenever possible
Language restriction is most robust when only one language is provided. This is strongly recommended for production use.
For example, restricting to English only:
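A minimal sketch of such a configuration, written as a Python dictionary and printed as JSON. The field names language_hints and language_hints_strict and the model name stt-rt-v3 are taken from this page; the "model" key and the overall shape of the request are assumptions.

```python
import json

# English-only restriction.
# Assumption: the model is selected with a "model" key in the same config object.
english_only = {
    "model": "stt-rt-v3",
    "language_hints": ["en"],
    "language_hints_strict": True,
}

print(json.dumps(english_only, indent=2))
```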
⚠️ Multiple languages reduce robustness
You may specify multiple languages, but accuracy can degrade when language identification becomes ambiguous, especially with heavy accents or acoustically similar languages.
Example (English + Spanish):
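A minimal sketch of a two-language configuration, written as a Python dictionary and printed as JSON. The field names and the model name stt-rt-v3 are taken from this page; the "model" key and the overall shape of the request are assumptions.

```python
import json

# Restriction to English and Spanish.
# Assumption: the model is selected with a "model" key in the same config object.
english_spanish = {
    "model": "stt-rt-v3",
    "language_hints": ["en", "es"],
    "language_hints_strict": True,
}

print(json.dumps(english_spanish, indent=2))
```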
In difficult cases (e.g. heavily accented English spoken by a Hindi speaker), the model may still choose the “wrong” language and transcribe using the wrong script. This is why single-language restriction is strongly recommended when correctness is critical.
When to use language restrictions
Use language restriction when:
- Your application expects only one known language
- You want to avoid transliteration into the wrong alphabet
- You want higher accuracy than using language_hints alone
- You are processing speech with strong accents
Language restriction provides a stronger signal than language hints without restriction.
Language identification behavior
Automatic language identification is still technically active when language restriction is enabled. However:
- Language restriction is intended for cases where the spoken language is already known.
- If you need full automatic language detection across many languages, do not enable strict language restriction.
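The guidance above can be sketched as a small decision helper in Python. The function name language_settings is hypothetical, and leaving both fields out when the language is unknown is one possible reading of "do not enable strict language restriction", not documented behavior.

```python
from typing import Any, Optional


def language_settings(known_languages: Optional[list[str]]) -> dict[str, Any]:
    """Choose language fields based on whether the spoken language is known (hypothetical helper)."""
    if not known_languages:
        # Language unknown: leave restriction off and rely on automatic
        # language identification (assumption: omitting both fields does this).
        return {}
    # Language known in advance: pass it as a hint and enable strict restriction.
    return {
        "language_hints": list(known_languages),
        "language_hints_strict": True,
    }


print(language_settings(["en"]))
print(language_settings(None))
```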
Supported languages
See the full list of supported languages and their ISO codes in the supported languages section.
Supported models
Language restriction is supported on:
stt-rt-v3