Models
Learn about the latest Text-to-Speech models, changelog, and deprecations.
Soniox Text-to-Speech is built for the hardest parts of speech generation. It delivers native-speaker-quality speech in 60+ languages, with hallucination-free output and accurate pronunciation of alphanumerics such as phone numbers, email addresses, and IDs.
This page lists the currently available models, along with release notes and important updates.
Current models
| Model | Type | Status |
|---|---|---|
| tts-rt-v1-preview | Real-time | Active |
Changelog
April 23, 2026
Overview
tts-rt-v1-preview is the first Soniox Text-to-Speech model, released in preview to gather developer feedback and guide further improvements before general availability.
Key capabilities
- Native-speaker-quality speech in 60+ languages
- Hallucination-free generation, with no invented words, dropped content, or unexpected substitutions
- Accurate rendering of alphanumerics such as email addresses, phone numbers, street addresses, IDs, and codes
- Streaming generation that begins before the sentence ends, for ultra-low-latency voice systems
- Multiple voices that work across all supported languages
- Configurable audio formats, sample rates, and bitrates
- Support for both WebSocket and REST APIs
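
As a rough illustration of pinning the model ID alongside audio settings, the sketch below builds a JSON request body in plain Python. Only the model ID `tts-rt-v1-preview` comes from this page. The `SpeechRequest` class, its field names (`language`, `voice`, `audio_format`, `sample_rate`), and the default values are hypothetical placeholders, not the actual Soniox API schema; consult the API reference for the real parameter names.

```python
import json
from dataclasses import asdict, dataclass

# The model ID is taken from the "Current models" table on this page.
# Every other name and value below is a hypothetical placeholder.
ACTIVE_MODELS = {"tts-rt-v1-preview"}


@dataclass(frozen=True)
class SpeechRequest:
    text: str
    model: str = "tts-rt-v1-preview"
    language: str = "en"              # hypothetical field name
    voice: str = "example-voice"      # hypothetical voice name
    audio_format: str = "pcm_s16le"   # hypothetical format label
    sample_rate: int = 24000          # hypothetical default, in Hz

    def to_json(self) -> str:
        """Validate the request locally and serialize it to JSON."""
        if self.model not in ACTIVE_MODELS:
            raise ValueError(f"unknown or inactive model: {self.model}")
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be positive")
        return json.dumps(asdict(self), sort_keys=True)


request = SpeechRequest(text="Call 555-0100 or email help@example.com.")
payload = request.to_json()
```

Keeping the model ID in one validated place makes it easier to switch models when a preview model is superseded or deprecated.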