Chinese to English speech translation API

Stream Chinese (中文) speech and get English back in real time. One WebSocket, ISO codes zh to en, and ultra-low latency for voice agents and live apps.

Production-ready Chinese to English translation API

Real Chinese speech includes accents, regional dialects, code-switching, and domain-specific vocabulary. Soniox recognizes all of it with a single model and streams English while the speaker is still talking.

Chinese (Sino-Tibetan > Sinitic) and English (Indo-European > Germanic > West Germanic) come from different language families, so word order and morphology differ. Soniox reorders meaning in-stream instead of word by word.

Chinese is written in Han characters (a CJK script) and English in the Latin script, so Soniox emits the English translation as Latin-script text.

A breakthrough in real-time Chinese to English translation

Translate before the sentence ends

English meaning lands as Chinese is spoken, not after the caption catches up.

Directional zh to en streaming

Set the source and target codes once. Both arrive in a single labeled token stream.

High-quality English output

Same model across every language, including historically underserved ones.

Native-speaker Chinese STT accuracy

Accurate English translation starts with accurate Chinese recognition across accents and language switching.

Names, numbers, and domain terms

Preserved across the pair, including phone numbers, emails, and IDs.

config.json
{
  "model": "stt-rt-v4",
  "translation": {
    "type": "one_way",
    "source_language": "zh",
    "target_language": "en"
  }
}

Chinese and English through a single stream

Chinese to English translation is built on top of the Soniox Speech-to-Text API. Every spoken word is transcribed, and the English translation streams mid-sentence in the same labeled token stream.
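To make the idea of a single labeled token stream concrete, here is a minimal Python sketch that separates Chinese transcript tokens from English translation tokens. The field names `text`, `language`, and `translation_status` and the sample tokens are assumptions written for illustration, not captured API output, so check them against the API reference.

```python
from typing import Iterable


def split_token_stream(tokens: Iterable[dict]) -> tuple[str, str]:
    """Separate original-language text from translated text in one token list."""
    transcript_parts: list[str] = []
    translation_parts: list[str] = []
    for token in tokens:
        # "translation_status" and "text" are assumed field names.
        if token.get("translation_status") == "translation":
            translation_parts.append(token["text"])
        else:
            transcript_parts.append(token["text"])
    return "".join(transcript_parts), "".join(translation_parts)


# Hypothetical tokens, hand-written for illustration (not real API output).
sample_tokens = [
    {"text": "你好", "language": "zh", "translation_status": "original"},
    {"text": "Hello", "language": "en", "translation_status": "translation"},
    {"text": "世界", "language": "zh", "translation_status": "original"},
    {"text": " world", "language": "en", "translation_status": "translation"},
]

transcript, translation = split_token_stream(sample_tokens)
print(transcript)   # 你好世界
print(translation)  # Hello world
```

Because both kinds of token share one stream, a caption UI can render the two strings side by side without correlating two separate connections.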

Turn it on by adding a translation block with source_language: "zh" and target_language: "en". It runs on the same WebSocket and the same model, with no extra round trip.
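As a rough sketch of what building that request could look like in Python: the `model` and `translation` fields come from the config.json shown above, while the `api_key` field and the idea of sending this JSON as the first WebSocket message are assumptions that should be confirmed in the API reference.

```python
import json


def build_start_request(api_key: str, source: str = "zh", target: str = "en") -> str:
    """Serialize a session config that includes a one-way translation block."""
    config = {
        "api_key": api_key,  # assumed field name; not shown on this page
        "model": "stt-rt-v4",
        "translation": {
            "type": "one_way",
            "source_language": source,
            "target_language": target,
        },
    }
    return json.dumps(config)


# The placeholder key is illustrative only.
print(build_start_request("YOUR_API_KEY"))
```

Keeping the translation block inside the same config as the model choice mirrors the point above: there is one request and one connection, not a second translation call.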

Live Chinese to English: written and spoken

Chinese to English speech-to-text

Live Chinese speech → English text

Translate live Chinese into written English with the Soniox STT API. Soniox streams the Chinese transcript and the English translation as speech happens.

Use it for English captions, subtitles, meeting translation, agent assist, and multilingual transcription.

Chinese to English speech-to-speech

Live Chinese speech → Spoken English

Build full spoken Chinese to English translation by combining Soniox STT and Soniox TTS. Soniox recognizes Chinese, translates it, and speaks English with low latency.

Use it for live interpreters, bilingual voice agents, travel assistants, and customer support.

Live Chinese to English translation in action

Stream Chinese to English one-way to push all speech into English, or two-way to keep a bilingual conversation flowing between the two languages.

Chinese speaker says: 群里那个milk tea的deal快没了,deadline是12:30,你要不要一起拼单?
Translated into English in real time.

One-way translation

Translate live Chinese into English. Everyone in the conversation sees the same translated stream.

Ideal for live captions, multilingual meetings, broadcasts, lectures, and customer calls.

Chinese speaker talks in Chinese.
English speaker hears English.
English speaker replies in English.
Chinese speaker hears Chinese.

Two-way translation

Translate between Chinese and English for live bilingual conversation. Each side speaks naturally and hears the other in their own language.

Soniox supports real-time two-way translation between any two of 60+ supported languages.
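The difference between the two modes can be sketched as a small Python helper that builds the translation block for either one. The one-way fields match the config.json shown earlier. This page does not show a two-way config, so the `two_way` type and the `language_a` / `language_b` field names are assumptions to verify against the API reference.

```python
def translation_block(mode: str, first: str, second: str) -> dict:
    """Build a translation config block for one-way or two-way translation."""
    if mode == "one_way":
        # Field names taken from the config.json example on this page.
        return {"type": "one_way", "source_language": first, "target_language": second}
    if mode == "two_way":
        # Assumed field names; the page does not show a two-way config.
        return {"type": "two_way", "language_a": first, "language_b": second}
    raise ValueError(f"unknown translation mode: {mode!r}")


print(translation_block("one_way", "zh", "en"))
print(translation_block("two_way", "zh", "en"))
```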

Accurate on both ends of the pair

Soniox transcribes Chinese at a 6.6% word error rate and English at a 6.5% word error rate. Accurate recognition on both sides is what makes the translation reliable.
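For readers unfamiliar with the metric, word error rate is conventionally the minimum number of substitutions, deletions, and insertions needed to turn the recognized text into the reference, divided by the reference length. The Python sketch below computes it with a standard edit-distance table. The example sentences are invented, and the page does not say how the figures above were tokenized or which test sets were used.

```python
def word_error_rate(reference: list[str], hypothesis: list[str]) -> float:
    """Edit distance between token lists, divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # prev[j] = edit distance between the reference prefix so far and hypothesis[:j]
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(reference)


ref = "the deadline is twelve thirty".split()
hyp = "the deadline was twelve".split()
print(word_error_rate(ref, hyp))  # 0.4 (one substitution + one deletion over 5 words)
```

For Chinese, which is written without spaces, the same function can be applied to a list of characters instead of whitespace-separated words, for example `list("你好世界")`.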

Speech-to-Text

Native-speaker accuracy across 60+ languages, with support for multilingual speech, alphanumerics, speaker diarization, and context.

Translation

Real-time streaming translation across 3,600 language pairs, built for high quality and low delay across all supported languages.

Text-to-Speech

High-fidelity speech generation in 60+ languages, built for names, alphanumerics, language switching, and ultra-low-latency streaming.

Together, they create a complete real-time low-latency speech AI platform.

About Chinese and English

Chinese has roughly 1.1 billion speakers across China, Taiwan, and Singapore. English has roughly 1.5 billion speakers across the United States, the United Kingdom, Canada, and Australia.

Chinese characters are one of the oldest continuously used writing systems, with over 3,000 years of history.

English is the most widely spoken language in the world when including both native and non-native speakers.

Soniox makes Chinese to English translation usable in real time, as it does for every supported language pair.

Frequently asked questions

How do I translate Chinese to English with the API?
Add a translation block to your real-time request with source_language "zh" and target_language "en". Soniox transcribes Chinese and streams the English translation over the same WebSocket.

Is Chinese to English translation real-time?
Yes. Soniox streams English while Chinese is still being spoken, so meaning arrives mid-sentence instead of after the sentence ends.

What about translating English to Chinese?
That direction is supported too. See the English to Chinese page, or use two-way translation to run both directions in one session.

Does Soniox handle CJK to Latin output?
Yes. Soniox outputs correctly scripted English text in Latin directly in the token stream.

Does Soniox handle Chinese dialects and accents?
Yes. Soniox handles Chinese dialects like Mandarin, Cantonese, and Wu in a single model, so English translation stays accurate across regions.

Which other providers support Chinese to English?
Based on their public docs, OpenAI, Google, Azure, and Speechmatics list both Chinese and English for real-time translation. Soniox is the only one that also supports two-way live translation across 60+ languages.

How fast is Chinese to English translation?
Soniox streams English as Chinese is being spoken, with ultra-low latency. Translation arrives before the sentence ends.

Ready to get started?

Create an account instantly, or contact us to design a custom package for your business.

Build with API

Documentation

Get up and running in minutes and spend your time building the product, not wrestling with the API.

Explore docs

See what you’ll pay

Pay only for what you use with our flexible pricing. Built to scale with you.

Pricing details