Models
Learn about the latest models, changelog, and deprecations.
Soniox Speech-to-Text AI provides multiple models for real-time and asynchronous transcription and translation. This page lists the currently available models, their capabilities, and important updates.
Current models
| Model | Type | Status |
|---|---|---|
| stt-rt-v3 | Real-time | Active |
| stt-async-v3 | Async | Active |
Aliases
Aliases provide a stable reference so you don’t need to change your code when newer versions are released.
| Alias | Points to | Notes |
|---|---|---|
| stt-rt-v3-preview | stt-rt-v3 | Always points to the latest real-time active model |
| stt-rt-preview-v2 | stt-rt-v3 | |
| stt-async-preview-v1 | stt-async-v3 | |
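As a rough illustration of how to read the alias table above, the following Python sketch mirrors it as a lookup. The `ALIASES` dictionary and the `resolve_model` helper are hypothetical names chosen for this example and are not part of any Soniox SDK. Aliases are resolved by Soniox, so client code does not need this logic.

```python
# Hypothetical client-side mirror of the alias table above.
# Aliases are resolved by the service; this only illustrates the mapping.
ALIASES = {
    "stt-rt-v3-preview": "stt-rt-v3",
    "stt-rt-preview-v2": "stt-rt-v3",
    "stt-async-preview-v1": "stt-async-v3",
}


def resolve_model(name: str) -> str:
    """Return the model an alias points to, or the name itself if it is not an alias."""
    return ALIASES.get(name, name)


if __name__ == "__main__":
    for name in ("stt-rt-preview-v2", "stt-async-v3"):
        print(f"{name} -> {resolve_model(name)}")
```

A lookup like this can be handy in logs or dashboards, where it helps to record which model actually served a request that was sent under an alias.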
Changelog
October 31, 2025
Model retirement and upgrade
We have accelerated the retirement of older models following the overwhelmingly positive response to the new v3 models. The following models have been retired:
- stt-async-preview-v1
- stt-rt-preview-v2
Both models have been aliased to the new Soniox v3 models. This means all existing requests using the old model names are now automatically served with v3, giving every user our most accurate, capable, and intelligent voice AI experience, without any code changes required.
Context compatibility
The context feature is now backward compatible with v3 models, ensuring smooth migration from older versions. However, we strongly recommend updating to the new context structure for best results and future flexibility. Learn more about context.
October 29, 2025
Model update: v3 enhancements
Applies to: stt-rt-v3, stt-async-v3
New features
- Extended audio duration support: both real-time (stt-rt-v3) and asynchronous (stt-async-v3) models now support audio up to 5 hours in a single request.
Quality improvements
- Higher transcription accuracy across challenging audio conditions and diverse languages.
Notes
- No API changes are required; existing integrations continue to work seamlessly.
- For asynchronous processing, large files up to 5 hours can now be uploaded directly without chunking.
- For real-time streaming, sessions up to 5 hours are supported under the same WebSocket connection.
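A client may want to check the audio length before sending a request. The sketch below assumes the duration is already known in seconds. `MAX_AUDIO_SECONDS` and `fits_single_request` are hypothetical names, and treating exactly 5 hours as allowed is an assumption, since this page only states "up to 5 hours".

```python
# 5 hours expressed in seconds: 5 * 60 * 60 = 18000.
MAX_AUDIO_SECONDS = 5 * 60 * 60


def fits_single_request(duration_seconds: float) -> bool:
    """Return True if audio of this length fits in one request.

    Assumption: the 5-hour limit is inclusive and zero-length audio is rejected.
    """
    return 0 < duration_seconds <= MAX_AUDIO_SECONDS


if __name__ == "__main__":
    for seconds in (3600, 18000, 18001):
        print(seconds, fits_single_request(seconds))
```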
October 21, 2025
New models: stt-rt-v3, stt-async-v3
Replaces: stt-rt-preview-v2, stt-async-preview-v1
Overview
The v3 models introduce major improvements across recognition, translation, and reasoning — making Soniox faster, more accurate, and more capable than ever before.
These models power real-time and asynchronous speech processing in 60+ languages, with enhanced accuracy, robustness, and context understanding.
Key improvements
- Higher transcription accuracy across 60+ languages
- Improved multilingual switching — seamless recognition when speakers change language mid-sentence
- Significantly higher translation quality, especially for languages such as German and Korean
- The async model now also supports translation
- Support for new advanced structured context, enabling richer domain- and task-specific adaptation
- Enhanced alphanumeric accuracy (addresses, IDs, codes, serials)
- More accurate speaker diarization, even in overlapping speech
- Extended maximum audio duration to 5 hours for both async and real-time models
API compatibility
- The v3 models are fully compatible with the existing Soniox API, provided you are not using the context feature.
- To upgrade, simply replace the model name in your API request:
  - { "model": "stt-rt-v3" } for real-time
  - { "model": "stt-async-v3" } for async
- If you are using the context feature, update to the new structured context for improved accuracy.
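The upgrade step above can be sketched in Python as follows. Only the `model` field and the model names come from this page. The `upgrade_config` helper and the `placeholder_option` field are hypothetical and stand in for whatever else an existing request config contains.

```python
import json

# Replacement map taken from the "Replaces" line of this changelog entry.
V3_REPLACEMENTS = {
    "stt-rt-preview-v2": "stt-rt-v3",
    "stt-async-preview-v1": "stt-async-v3",
}


def upgrade_config(config: dict) -> dict:
    """Return a copy of a request config with a preview model swapped for its v3 successor.

    Hypothetical helper: every other field is copied through unchanged,
    and configs that already name a v3 model are returned as-is.
    """
    upgraded = dict(config)
    model = upgraded.get("model")
    if model in V3_REPLACEMENTS:
        upgraded["model"] = V3_REPLACEMENTS[model]
    return upgraded


if __name__ == "__main__":
    # "placeholder_option" is a made-up field used only to show that
    # non-model settings are preserved.
    old_config = {"model": "stt-rt-preview-v2", "placeholder_option": True}
    print(json.dumps(upgrade_config(old_config)))
```

The helper returns a new dictionary instead of mutating its input, so the original config stays available for comparison or rollback.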
Deprecation notice
The following preview models are deprecated and will be retired on November 30, 2025:
- stt-async-preview-v1
- stt-rt-preview-v2
Please migrate to the v3 models before that date to ensure uninterrupted service.
August 15, 2025
- Deprecated stt-rt-preview-v1
August 5, 2025
- Released stt-rt-preview-v2
  - Higher transcription accuracy
  - Improved translation quality
  - Expanded to support all translation pairs
  - More reliable automatic language switching
  - Replaces: stt-rt-preview-v2, stt-async-preview-v1