Live speech translation API
Translate live speech across 60+ languages and 3,600 language pairs with ultra-low latency, high quality, and true real-time streaming.
Built for the hardest parts of speech translation
Speech translation breaks when systems wait too long, mistranscribe speech, or only work well for a few major languages.
Real conversations are messy. People speak with accents, switch languages, say names, addresses, emails, phone numbers, IDs, and domain-specific terms. In live conversation, every second of delay makes the experience feel broken.
Soniox speech translation is built for this reality.
It combines native-speaker speech recognition, real-time streaming translation, and high-fidelity text-to-speech into one platform for production speech translation across 60+ languages and 3,600 language pairs.
A breakthrough in real-time speech translation
Translate before the sentence ends
Soniox streams translation while speech is still happening, so users see or hear meaning immediately.
3,600 language pairs
Translate between any supported languages across 60+ languages, not just English-centric workflows.
High quality across languages
High-quality translation across 60+ languages, including historically underserved languages.
Built on native-speaker STT accuracy
Accurate translation starts with accurate recognition across accents, multilingual speech, and language switching.
Handles names, numbers, and domain terms
Soniox preserves critical details, including names, phone numbers, emails, IDs, and domain-specific terminology.
{
  "model": "stt-rt-v4",
  "translation": {
    "type": "one_way",
    "target_language": "en"
  }
}
Transcription + translation through a single stream
Soniox speech-to-text translation is built into the Soniox Speech-to-Text API. Soniox transcribes every spoken word and translates mid-sentence. Transcript and translation arrive together in a single labeled token stream.
Turn it on by adding a translation config to your speech-to-text API request and translation will run on the same WebSocket.
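As a minimal sketch of what that request could look like from Python, the helper below builds the JSON config shown above. Only `model`, `translation.type`, and `translation.target_language` come from this page; the function name `build_start_request` is hypothetical, and a real request will likely need further fields (such as authentication and audio format) that this page does not list.

```python
import json


def build_start_request(target_language: str, model: str = "stt-rt-v4") -> str:
    """Return a one-way translation config as a JSON string.

    The field names mirror the config shown on this page; any other
    fields a real request needs are intentionally left out.
    """
    config = {
        "model": model,
        "translation": {
            "type": "one_way",
            "target_language": target_language,
        },
    }
    return json.dumps(config)


# Example: translate every speaker into English.
print(build_start_request("en"))
```

The resulting string would be sent as part of the speech-to-text request, so transcription and translation share one connection.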
Translate live speech to text or to spoken output
Use Soniox STT alone to stream translated text alongside the transcript, or combine STT and TTS to speak the translation in the target language. Both run on the same real-time pipeline.
Speech-to-text translation
Translate live speech into written text using the Soniox STT API. Enable real-time translation with a simple configuration change — Soniox streams transcripts and translated text as speech happens.
Use it for captions, subtitles, meeting translation, agent assist, accessibility tools, and multilingual transcription.
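To illustrate consuming a stream that carries both transcript and translation, here is a small sketch that separates original-language tokens from translated ones, for example to render two caption tracks. The token shape is an assumption made for illustration: the `text` and `translation_status` keys and the `"translation"` label value are not specified on this page, so check the API reference for the actual field names.

```python
from typing import Iterable


def split_tokens(tokens: Iterable[dict]) -> tuple[str, str]:
    """Split a mixed token stream into (transcript, translation) text.

    Assumes each token is a dict with a "text" key and a
    "translation_status" label (hypothetical field names).
    """
    transcript_parts = []
    translation_parts = []
    for token in tokens:
        if token.get("translation_status") == "translation":
            translation_parts.append(token["text"])
        else:
            # Anything not labeled as a translation is treated as transcript.
            transcript_parts.append(token["text"])
    return "".join(transcript_parts), "".join(translation_parts)


# Hand-written sample tokens, not real API output.
sample = [
    {"text": "Hola", "translation_status": "original"},
    {"text": " mundo", "translation_status": "original"},
    {"text": "Hello", "translation_status": "translation"},
    {"text": " world", "translation_status": "translation"},
]
transcript, translation = split_tokens(sample)
print(transcript)
print(translation)
```

Keeping the two tracks separate lets a caption UI show the original and the translation side by side, or show only the translation.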
Speech-to-speech translation
Build full spoken translation by combining Soniox STT and Soniox TTS. Soniox recognizes speech, translates it in real time, and speaks the output in the target language with low latency.
Use it for live interpreters, bilingual voice agents, travel assistants, customer support, and real-time multilingual communication.
One-way or two-way translation
One-way translates every speaker into a single target language. Two-way runs a live bilingual conversation between two languages, so each side speaks naturally and hears the other in their own.
One-way translation
Translate speech from any supported language into a single target language.
Ideal for live captions, multilingual meetings, broadcasts, lectures, events, customer calls, and products where many speakers need to be understood in one language.
Two-way translation
Translate between two languages for live bilingual conversation.
Soniox supports real-time two-way translation between any two supported languages, so both sides can speak naturally and understand each other instantly.
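A two-way config could be sketched by analogy with the one-way config shown earlier on this page. The `"two_way"` type value and the `language_a` / `language_b` field names below are assumptions made for illustration and are not confirmed by this page; only `model` and the general `translation` object come from the source.

```python
import json


def build_two_way_config(language_a: str, language_b: str,
                         model: str = "stt-rt-v4") -> dict:
    """Build a hypothetical two-way translation config.

    Field names under "translation" are assumed, not taken from this page.
    """
    if language_a == language_b:
        raise ValueError("two-way translation needs two different languages")
    return {
        "model": model,
        "translation": {
            "type": "two_way",          # assumed value
            "language_a": language_a,   # assumed field name
            "language_b": language_b,   # assumed field name
        },
    }


# Example: a live English <-> Spanish conversation.
print(json.dumps(build_two_way_config("en", "es"), indent=2))
```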
Translate between all supported languages
Real-time speech translation across 3,600 language pairs, including support for mixed-language speech and language switching within a conversation.
Speech translation for global products
Voice agents
Build multilingual voice agents that understand users in one language and respond in another.
Use for support, sales, scheduling, healthcare, and global voice automation.
Live interpreters
Create real-time interpreter experiences for conversations, meetings, events, and business communication.
Bilingual conversation feels immediate instead of delayed.
Multilingual meetings
Translate live meetings across languages with captions, transcripts, summaries, and action items.
Support global teams without forcing everyone into English.
Customer support and contact centers
Translate live customer calls while preserving names, numbers, addresses, and verification codes.
Give agents and customers a smoother multilingual experience.
Captions and subtitles
Generate real-time translated captions for broadcasts, webinars, classrooms, and live streams.
Translate speech as it happens, without long caption delays.
Accessibility and communication tools
Build assistive products that help people understand speech across languages in real time.
Live captions, translated transcripts, and spoken translation.
Built on the Soniox speech AI platform
Soniox speech translation is powered by the same infrastructure behind Soniox STT and TTS.
Speech-to-Text
Native-speaker accuracy across 60+ languages, with support for multilingual speech, alphanumerics, speaker diarization, and context.
Translation
Real-time streaming translation across 3,600 language pairs, built for high quality and low delay across all supported languages.
Text-to-speech
High-fidelity speech generation in 60+ languages, built for names, alphanumerics, language switching, and ultra-low-latency streaming.
Together, they create a complete real-time low-latency speech AI platform.
Simple, usage-based pricing
Start translating live audio streams from ~$0.18/hour.
Translation is already built into the Soniox Speech-to-Text API. When turned on, it adds about $0.06/hour in output token costs.
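For rough budgeting, the figures above can be turned into a quick estimate. This sketch assumes the ~$0.18/hour starting price already includes the ~$0.06/hour translation portion, which this page does not state explicitly; both rates are approximate, and actual billing depends on token usage.

```python
# Approximate rates quoted on this page, in USD per hour of audio.
TRANSLATED_STREAM_RATE = 0.18  # assumed to include the translation portion
TRANSLATION_ADDON_RATE = 0.06  # approximate output-token cost of translation


def estimate_cost(hours: float) -> dict:
    """Estimate cost for `hours` of translated live audio (rough sketch)."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return {
        "total_usd": round(hours * TRANSLATED_STREAM_RATE, 2),
        "translation_usd": round(hours * TRANSLATION_ADDON_RATE, 2),
    }


# Example: 100 hours of translated audio.
print(estimate_cost(100))
```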
Frequently asked questions
What is Soniox speech translation?
How many languages are supported?
What is the difference between speech-to-text and speech-to-speech translation?
What is the difference between one-way and two-way translation?
Does Soniox handle accents and multilingual speech?
Can Soniox handle names, numbers, and domain-specific terms?
What can I build with Soniox speech translation?
How fast is the translation?
Ready to get started?
Create an account instantly, or contact us to design a custom package for your business.
Build with API
Documentation
Get up and running in minutes and spend your time building the product, not wrestling with the API.
Explore docs
See what you’ll pay
Pay only for what you use with our flexible pricing. Built to scale with you.
Pricing details