Confidence scores

Overview

Soniox Speech-to-Text AI provides a confidence score for every recognized token (word or sub-word) in the transcript. The confidence score represents the model’s estimate of how likely it is that the token was recognized correctly.

Confidence values are floating-point numbers between 0.0 and 1.0:

1.0 → very high confidence.
0.0 → very low confidence.

Low confidence values typically occur when recognition is uncertain due to factors like background noise, heavy accents, unclear speech, or uncommon vocabulary.

You can use confidence scores to:

Assess overall transcription quality.
Flag or highlight uncertain words in a transcript.
Trigger post-processing, e.g., request user confirmation or re-check with additional context.
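
The uses listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming the tokens have already been parsed into a list of dictionaries with text and confidence fields; the function names and the 0.6 threshold are hypothetical choices for this sketch, not part of the Soniox API. The highlight helper also assumes that concatenating the token text reproduces the transcript.

```python
from statistics import mean

# Hypothetical cutoff (an assumption); tune it for your audio and use case.
LOW_CONFIDENCE_THRESHOLD = 0.6


def average_confidence(tokens):
    """Return the mean token confidence, a rough proxy for overall quality."""
    if not tokens:
        return 0.0
    return mean(t["confidence"] for t in tokens)


def flag_uncertain(tokens, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Return the tokens whose confidence falls below the threshold."""
    return [t for t in tokens if t["confidence"] < threshold]


def highlight(tokens, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Render the token text, wrapping uncertain tokens in [? ... ?].

    Assumes that joining the token text yields the transcript.
    """
    return "".join(
        f"[?{t['text']}?]" if t["confidence"] < threshold else t["text"]
        for t in tokens
    )
```

Tokens returned by flag_uncertain could then be shown to a user for confirmation or passed to a second recognition pass with additional context.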

Confidence scores are included in every response by default; no extra configuration is needed.

Output format

Each token in the API response includes:

text → the recognized token.
confidence → the confidence score for that token.

Example response

In this example, the word “Beautiful” is split into three tokens, each with its own confidence score:

{
  "tokens": [
    {"text": "Beau", "confidence": 0.82},
    {"text": "ti",   "confidence": 0.87},
    {"text": "ful",  "confidence": 0.98}
  ]
}
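
As a rough sketch, the response above could be parsed and its sub-word tokens combined into a single word-level score. Taking the minimum token confidence as the score for the whole word is an assumption made here for illustration (a conservative choice, since one uncertain piece makes the whole word uncertain); this page describes only per-token scores, and word_confidence is a hypothetical helper.

```python
import json

# The example response from above, as a raw JSON string.
response_body = """
{
  "tokens": [
    {"text": "Beau", "confidence": 0.82},
    {"text": "ti",   "confidence": 0.87},
    {"text": "ful",  "confidence": 0.98}
  ]
}
"""


def word_confidence(tokens):
    """Join the tokens of one word and score it by its weakest token.

    Using min() as the aggregate is an assumption of this sketch.
    """
    text = "".join(t["text"] for t in tokens)
    score = min(t["confidence"] for t in tokens)
    return text, score


tokens = json.loads(response_body)["tokens"]
word, score = word_confidence(tokens)
print(word, score)  # prints: Beautiful 0.82
```

Other aggregates, such as the mean or the product of the token confidences, are equally plausible; which one fits best depends on how strictly uncertain words should be flagged.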
