Language restrictions

Learn how to restrict recognition to specific languages using language hints.

Overview

Soniox Speech-to-Text AI supports restricting recognition to specific languages. This is useful when your application expects speech in a known language and you want to avoid accidental transcription in other languages, especially in cases of heavy accents or ambiguous pronunciation.

Language restriction is best-effort, not a hard guarantee. The model is strongly biased toward the specified languages, but it can still output another language in rare edge cases. When configured correctly, this happens very infrequently.


How language restrictions work

Language restriction is enabled using two parameters:

  • language_hints
    A list of expected spoken languages, provided as ISO language codes (e.g. en for English, es for Spanish).
  • language_hints_strict
    A boolean flag that enables language restriction based on the provided hints.

When language_hints_strict is set to true, the model will strongly prefer producing output only in the specified languages.

Best results are achieved when specifying a single language.
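
As a minimal sketch of how the two parameters fit together, the Python helper below builds the language-related fields as a plain dictionary. The function name build_language_config is hypothetical, and the rule that strict mode needs at least one hint is an assumption drawn from the description above, not documented API behavior.

```python
import json


def build_language_config(language_hints, strict=False):
    """Return the language-related request fields as a plain dict.

    Hypothetical helper: only the two field names come from the docs.
    """
    if isinstance(language_hints, str):
        # A bare string like "en" would otherwise be treated as a sequence of characters.
        raise TypeError("language_hints must be a list of codes, not a single string")
    hints = list(language_hints)
    if strict and not hints:
        # Assumption: restriction is based on the hints, so strict mode needs at least one.
        raise ValueError("language_hints_strict requires at least one language hint")
    return {"language_hints": hints, "language_hints_strict": bool(strict)}


config = build_language_config(["en"], strict=True)
print(json.dumps(config, indent=2))
```

The resulting dictionary can be merged into whatever request configuration your client already sends.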


✅ Use a single language whenever possible

Language restriction is most robust when only one language is provided. This is strongly recommended for production use.

For example, restricting to English only:

{
  "language_hints": ["en"],
  "language_hints_strict": true
}

⚠️ Multiple languages reduce robustness

You may specify multiple languages, but accuracy can degrade when language identification becomes ambiguous, especially with heavy accents or acoustically similar languages.

Example (English + Spanish):

{
  "language_hints": ["en", "es"],
  "language_hints_strict": true
}

In difficult cases (e.g. heavily accented English spoken by a Hindi speaker), the model may still choose the “wrong” language and transcribe using the wrong script. This is why single-language restriction is strongly recommended when correctness is critical.
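
One way to surface this trade-off during development is a small client-side check that warns when strict mode is combined with more than one hint. This is only a sketch: warn_if_multilingual_strict is a hypothetical helper and is not part of any Soniox SDK.

```python
import warnings


def warn_if_multilingual_strict(config):
    """Warn when strict restriction is combined with more than one language hint.

    Returns True if a warning was issued, False otherwise.
    """
    hints = config.get("language_hints", [])
    strict = config.get("language_hints_strict", False)
    if strict and len(hints) > 1:
        warnings.warn(
            f"Strict restriction with {len(hints)} languages ({', '.join(hints)}) "
            "is less robust than restricting to a single language.",
            stacklevel=2,
        )
        return True
    return False


# A single-language strict config passes silently.
print(warn_if_multilingual_strict({"language_hints": ["en"], "language_hints_strict": True}))
```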


When to use language restrictions

Use language restriction when:

  • Your application expects only one known language
  • You want to avoid transliteration into the wrong alphabet
  • You want higher accuracy than using language_hints alone
  • You are processing speech with strong accents

Language restriction provides a stronger signal than language hints without restriction.


Language identification behavior

Automatic language identification is still technically active when language restriction is enabled. However, language restriction is intended for cases where the spoken language is already known.

If you need full automatic language detection across many languages, do not enable strict language restriction.
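
The decision above can be sketched as a small helper that enables strict restriction only when the spoken language is known in advance. The name language_fields is hypothetical, and returning no fields for the unknown case assumes that omitting both parameters leaves automatic language detection unrestricted.

```python
def language_fields(known_languages=None):
    """Choose language-related fields based on whether the language is known.

    Hypothetical helper; only the two field names come from the docs.
    """
    if known_languages:
        # The spoken language is known ahead of time: restrict to it.
        return {
            "language_hints": list(known_languages),
            "language_hints_strict": True,
        }
    # Language unknown: assumption is that omitting both fields
    # leaves automatic detection across languages in effect.
    return {}


print(language_fields(["en"]))
print(language_fields())
```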


Supported languages

See the full list of supported languages and their ISO codes in the supported languages section.


Supported models

Language restriction is supported on:

  • stt-rt-v3
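
Because support depends on the model, a client may want to check the model before enabling restriction. The sketch below does that against the list above. The helper name with_language_restriction and the use of a "model" key in the request dictionary are assumptions made for illustration.

```python
# Models listed in this section as supporting language restriction.
STRICT_CAPABLE_MODELS = {"stt-rt-v3"}


def with_language_restriction(request, languages):
    """Return a copy of `request` with strict language restriction added.

    Raises ValueError if the request's model is not in the supported set.
    """
    model = request.get("model")  # "model" as a field name is an assumption
    if model not in STRICT_CAPABLE_MODELS:
        raise ValueError(f"language restriction is not supported on model {model!r}")
    return {
        **request,
        "language_hints": list(languages),
        "language_hints_strict": True,
    }


print(with_language_restriction({"model": "stt-rt-v3"}, ["en"]))
```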