Confidence scores
Learn how to use the confidence scores of recognized tokens.
Overview
Soniox Speech-to-Text AI provides a confidence score for each recognized token (word or sub-word) in the transcript. The confidence score is the AI model's estimate of how likely it is that the token was recognized correctly.
Confidence values range from 0.0 to 1.0, where:
- 1.0 indicates very high confidence
- 0.0 indicates very low confidence
Lower confidence scores may indicate that the token was difficult to recognize, possibly due to background noise, accents, unclear speech, or uncommon vocabulary.
Confidence scores can be used to assess transcription quality, highlight uncertain words, or post-process results for improved accuracy.
Output format
Each token in the API response includes:
- text: The recognized token (word or sub-word)
- confidence: The confidence score for the token
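As a minimal sketch of reading these two fields in Python, the snippet below parses a JSON payload into typed tokens and checks that each confidence lies in the documented 0.0 to 1.0 range. The top-level "tokens" key, the `parse_tokens` helper, and the payload values are assumptions made for illustration, not confirmed details of the API response.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str          # the recognized token (word or sub-word)
    confidence: float  # confidence score between 0.0 and 1.0


def parse_tokens(payload: str) -> list[Token]:
    """Parse a JSON payload into Token objects (hypothetical helper)."""
    data = json.loads(payload)
    tokens = []
    # Assumption: tokens are listed under a top-level "tokens" key.
    for item in data["tokens"]:
        confidence = float(item["confidence"])
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")
        tokens.append(Token(text=item["text"], confidence=confidence))
    return tokens


# Made-up payload for illustration only, not real API output.
payload = (
    '{"tokens": ['
    '{"text": "Hello", "confidence": 0.97}, '
    '{"text": " there", "confidence": 0.54}]}'
)
tokens = parse_tokens(payload)
for token in tokens:
    print(f"{token.text!r}: {token.confidence:.2f}")
```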
Example response
In this example, the word "beautiful" is split into three tokens, each with an associated confidence score:
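When a word is split into sub-word tokens like this, an application may want a single score per word. The Python sketch below shows one possible approach: it joins consecutive tokens and takes the lowest token confidence as the word confidence. The token values are invented for illustration and are not actual API output, the `merge_into_words` helper is hypothetical, and the rule that a token starting with a space begins a new word is an assumption.

```python
def merge_into_words(tokens):
    """Group sub-word tokens into words (hypothetical helper).

    The word confidence is the minimum of its token confidences,
    a conservative choice made for this sketch.
    """
    words = []
    for token in tokens:
        text, confidence = token["text"], token["confidence"]
        # Assumption: a leading space marks the start of a new word.
        if not words or text.startswith(" "):
            words.append({"text": text.strip(), "confidence": confidence})
        else:
            words[-1]["text"] += text
            words[-1]["confidence"] = min(words[-1]["confidence"], confidence)
    return words


# Made-up token values for illustration only.
tokens = [
    {"text": "Beau", "confidence": 0.82},
    {"text": "ti", "confidence": 0.87},
    {"text": "ful", "confidence": 0.98},
    {"text": " day", "confidence": 0.95},
]
words = merge_into_words(tokens)
print(words)
```

Taking the minimum means one uncertain sub-word is enough to mark the whole word as uncertain. Averaging the token scores is a gentler alternative if that proves too strict.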
Use cases
Use case | Description |
---|---|
Quality control | Flag tokens or phrases with low confidence for review. |
Post-processing | Trigger alternate processing (e.g., fuzzy matching, human verification) when confidence is low. |
Visual feedback | Display less confident words in a different style (e.g., faded or underlined) in user interfaces. |
Analytics | Analyze overall transcription confidence across sessions. |
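
The use cases above can be sketched in a few lines of Python. This example marks low-confidence tokens in the rendered transcript (visual feedback) and computes simple session statistics (quality control and analytics). The 0.6 threshold, the bracket markers, and the `render_transcript` and `summarize` helpers are hypothetical choices for illustration. A suitable threshold depends on the application and should be tuned on real data.

```python
from statistics import mean

LOW_CONFIDENCE = 0.6  # hypothetical threshold, tune for your application


def render_transcript(tokens, threshold=LOW_CONFIDENCE):
    """Join token texts, wrapping low-confidence tokens as [text?]."""
    parts = []
    for token in tokens:
        text = token["text"]
        if token["confidence"] < threshold:
            # Keep leading whitespace outside the marker.
            stripped = text.lstrip()
            lead = text[: len(text) - len(stripped)]
            parts.append(f"{lead}[{stripped}?]")
        else:
            parts.append(text)
    return "".join(parts)


def summarize(tokens, threshold=LOW_CONFIDENCE):
    """Return mean confidence and the share of low-confidence tokens."""
    if not tokens:
        return {"mean_confidence": None, "low_ratio": 0.0, "needs_review": False}
    scores = [token["confidence"] for token in tokens]
    low_count = sum(1 for score in scores if score < threshold)
    return {
        "mean_confidence": mean(scores),
        "low_ratio": low_count / len(scores),
        "needs_review": low_count > 0,
    }


# Made-up token values for illustration only.
tokens = [
    {"text": "The", "confidence": 0.99},
    {"text": " quick", "confidence": 0.45},
    {"text": " fox", "confidence": 0.91},
    {"text": " jumps", "confidence": 0.95},
]
rendered = render_transcript(tokens)
summary = summarize(tokens)
print(rendered)
print(summary)
```

In a user interface, the bracket markers would typically be replaced by styling such as a faded or underlined span.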