Most advanced real-time translation

Translate live speech across 60+ languages and 3,600 language pairs with ultra-low latency, high quality, and true real-time streaming.

Trusted by teams building global voice products

Built for the hardest parts of speech translation

Speech translation breaks when systems wait too long, mistranscribe speech, or only work well for a few major languages.

Real conversations are messy. People speak with accents, switch languages, and say names, addresses, emails, phone numbers, IDs, and domain-specific terms. In live conversation, every second of delay makes the experience feel broken.

Soniox speech translation is built for this reality. It combines native-speaker speech recognition, real-time streaming translation, and high-fidelity text-to-speech into one platform for production speech translation across 60+ languages and 3,600 language pairs.

A breakthrough in real-time speech translation

config.json
{
  "model": "stt-rt-v5",
  "translation": {
    "type": "one_way",
    "target_language": "en"
  }
}

Translate before the sentence ends

Soniox streams translation while speech is still happening, so users see or hear meaning immediately.

3,600 language pairs

Translate between any two of the 60+ supported languages, not just to and from English.

High quality across languages

High-quality translation across 60+ languages, including historically underserved languages.

Built on native-speaker STT accuracy

Accurate translation starts with accurate recognition across accents, multilingual speech, and language switching.

Handles names, numbers, and domain terms

Soniox preserves critical details, including names, phone numbers, emails, IDs, and domain-specific terminology.

Translate into text or speech in real time

Use Soniox STT alone to stream translated text alongside the transcript, or combine STT and TTS to speak the translation in the target language. Both run on the same real-time pipeline.

Speech-to-text translation

Translate live speech into written text using the Soniox STT API. Enable real-time translation with a simple configuration change — Soniox streams transcripts and translated text as speech happens.

Use it for captions, subtitles, meeting translation, agent assist, accessibility tools, and multilingual transcription.
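
As a rough illustration of that configuration change, the Python sketch below builds the config shown earlier and separates the original transcript from the translated text. The token layout and the `translation_status` field are assumptions made for illustration, not confirmed details of the Soniox API; check the Soniox documentation for the actual response format.

```python
import json


def build_config(target_language: str) -> str:
    """Return a JSON config string that enables one-way translation."""
    config = {
        "model": "stt-rt-v5",
        "translation": {"type": "one_way", "target_language": target_language},
    }
    return json.dumps(config)


def split_tokens(tokens):
    """Split streamed tokens into (transcript, translation) strings.

    Assumption: each token is a dict with a "text" key and a hypothetical
    "translation_status" key set to "translation" for translated text.
    """
    transcript, translation = [], []
    for token in tokens:
        if token.get("translation_status") == "translation":
            translation.append(token["text"])
        else:
            transcript.append(token["text"])
    return "".join(transcript), "".join(translation)


# Example with hand-written tokens (not real API output).
sample = [
    {"text": "Hola", "translation_status": "original"},
    {"text": " mundo", "translation_status": "original"},
    {"text": "Hello", "translation_status": "translation"},
    {"text": " world", "translation_status": "translation"},
]
transcript, translation = split_tokens(sample)
```

Keeping the two streams separate makes it straightforward to show the transcript and the translation side by side, as in captions or agent-assist views.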

Speech-to-speech translation

Build full spoken translation by combining Soniox STT and Soniox TTS. Soniox recognizes speech, translates it in real time, and speaks the output in the target language with low latency.

Use it for live interpreters, bilingual voice agents, travel assistants, customer support, and real-time multilingual communication.
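
The recognize, translate, and speak flow can be pictured as a small streaming pipeline. The Python sketch below is only a structural outline: `translate_chunk` and `synthesize` are hypothetical callables standing in for the STT and TTS steps, not Soniox SDK functions.

```python
from typing import Callable, Iterable, Iterator


def speech_to_speech(
    audio_chunks: Iterable[bytes],
    translate_chunk: Callable[[bytes], str],
    synthesize: Callable[[str], bytes],
) -> Iterator[bytes]:
    """Yield translated audio as soon as each chunk produces translated text.

    translate_chunk and synthesize are placeholders (assumptions) for the
    speech-to-text translation step and the text-to-speech step.
    """
    for chunk in audio_chunks:
        text = translate_chunk(chunk)
        if text:  # skip chunks that produced no translated text yet
            yield synthesize(text)


# Example with stub functions in place of real STT and TTS calls.
def fake_translate(chunk: bytes) -> str:
    return chunk.decode("utf-8").upper()


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


spoken = list(speech_to_speech([b"hola", b"", b"mundo"], fake_translate, fake_synthesize))
```

Writing the pipeline as a generator means output audio can start playing before the input stream has finished, which is the property that matters for live conversation.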

One-way or two-way translation

One-way translates every speaker into a single target language. Two-way runs a live bilingual conversation between two languages, so each side speaks naturally and hears the other in their own language.

One-way translation

Translate speech from any supported language into a single target language.

Ideal for live captions, multilingual meetings, broadcasts, lectures, events, customer calls, and products where many speakers need to be understood in one language.

Two-way translation

Translate between two languages for live bilingual conversation.

Soniox supports real-time two-way translation between any two supported languages, so both sides can speak naturally and understand each other instantly.
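
To make the difference between the two modes concrete, here is a minimal Python sketch that builds the `translation` part of the config for each. The `one_way` type and `target_language` field come from the config shown earlier; the `two_way` type and the `language_a` / `language_b` field names are assumptions for illustration and should be checked against the Soniox documentation.

```python
def translation_config(mode: str, *, target: str = None, pair: tuple = None) -> dict:
    """Build the "translation" section of a config for one-way or two-way mode."""
    if mode == "one_way":
        if not target:
            raise ValueError("one_way mode needs a target language")
        return {"type": "one_way", "target_language": target}
    if mode == "two_way":
        if pair is None or len(pair) != 2 or pair[0] == pair[1]:
            raise ValueError("two_way mode needs two different languages")
        # Field names below are assumed, not taken from the page.
        return {"type": "two_way", "language_a": pair[0], "language_b": pair[1]}
    raise ValueError(f"unknown mode: {mode}")


# Everyone is translated into English.
captions = translation_config("one_way", target="en")

# A Spanish speaker and an English speaker each hear the other's language.
conversation = translation_config("two_way", pair=("es", "en"))
```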

One platform for speech translation across 60+ languages and 3,600 language pairs, with real-time streaming and native-speaker accuracy.

Speech translation for global products

Voice agents

Build multilingual voice agents that understand users in one language and respond in another.

Use them for support, sales, scheduling, healthcare, and global voice automation.

Live interpreters

Create real-time interpreter experiences for conversations, meetings, events, and business communication.

Bilingual conversation feels immediate instead of delayed.

Multilingual meetings

Translate live meetings across languages with captions, transcripts, summaries, and action items.

Support global teams without forcing everyone into English.

Customer support

Translate live customer calls while preserving names, numbers, addresses, and verification codes.

Give agents and customers a smoother multilingual experience.

Captions and subtitles

Generate real-time translated captions for broadcasts, webinars, classrooms, and live streams.

Translate speech as it happens, without long caption delays.

Accessibility tools

Build assistive products that help people understand speech across languages in real time.

Offer live captions, translated transcripts, and spoken translation.

Estimate your speech translation cost

Start translating live audio streams from ~$0.18/hour. Translation is built into the Soniox Speech-to-Text API. Choose real-time or async translation and set your monthly volume below.
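
For a quick back-of-the-envelope figure, monthly cost is simply hours multiplied by the hourly rate. The Python sketch below uses the ~$0.18/hour starting price as a default; the actual rate may differ between real-time and async translation, so treat the rate as a parameter to fill in from the pricing page.

```python
def estimate_monthly_cost(hours_per_month: float, rate_per_hour: float = 0.18) -> float:
    """Estimate monthly cost in USD as hours multiplied by an hourly rate.

    The default rate is the "from ~$0.18/hour" starting price; it is an
    approximation, not a quote.
    """
    if hours_per_month < 0:
        raise ValueError("hours_per_month must be non-negative")
    return round(hours_per_month * rate_per_hour, 2)


# 1,000 hours of audio per month at the starting rate.
monthly = estimate_monthly_cost(1000)  # 180.0
```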

Pricing calculator

Stop overpaying for speech AI

Soniox vs. another provider, at 1,000 hours of audio / month (volume slider: 10 to 100k hours)

Pricing assumptions

Based on public pay-as-you-go pricing. Enterprise discounts and committed-use contracts may differ. Some providers charge separately for certain features. The calculator uses the public price for the provider configuration that most closely matches Soniox.

Frequently asked questions

What is Soniox speech translation?
Soniox speech translation is a real-time platform that translates live speech across 60+ languages and 3,600 language pairs with ultra-low latency. It combines native-speaker speech recognition, real-time streaming translation, and high-fidelity text-to-speech into one platform.
How many languages are supported?
Soniox supports 60+ languages and 3,600 language pairs, with translation between any two supported languages.
What is the difference between speech-to-text and speech-to-speech translation?
Speech-to-text translation takes live speech and outputs a transcript plus real-time translated text, using the Soniox STT API. Speech-to-speech translation combines Soniox STT and Soniox TTS to take live speech and output translated speech in the target language with low latency.
What is the difference between one-way and two-way translation?
One-way translation translates speech from any supported language into one target language — ideal for live captions, multilingual meetings, broadcasts, lectures, and customer calls. Two-way translation translates between two languages for live bilingual conversation, so both sides can speak naturally and understand each other instantly.
Does Soniox handle accents and multilingual speech?
Yes. Soniox STT is built for native-speaker accuracy and handles accents, multilingual speech, and language switching across 60+ languages.
Can Soniox handle names, numbers, and domain-specific terms?
Yes. Soniox preserves the details that matter, including names, phone numbers, emails, IDs, addresses, verification codes, and domain-specific terminology.
What can I build with Soniox speech translation?
Common use cases include multilingual voice agents, live interpreters, multilingual meetings, customer support and contact centers, real-time translated captions and subtitles, and accessibility and communication tools.
How fast is the translation?
Soniox streams translation while speech is still happening, so users see or hear meaning immediately. Translation arrives before the sentence ends, rather than after a long delay.

Ready to get started?

Create an account instantly, or contact us to design a custom package for your business.

Build with API

Documentation

Get up and running in minutes and spend your time building, not wrestling with the API.

Explore docs

See what you’ll pay

Pay only for what you use with our flexible pricing. Built to scale with you.

Pricing details