Thai speech-to-text
transcription and translation
Speech, (mis)understood.
Thai is spoken by over 70 million people worldwide — across Thailand, Laos, Myanmar, Cambodia, Malaysia, and beyond. For years, Thai speech-to-text has fallen short, failing at fundamentals like accurate and reliable recognition, multiple languages, and alphanumerics. It converted Thai audio into words, but the words lacked meaning and context.
Until now. Soniox reimagined everything Thai speech-to-text got wrong. You can speak naturally, switch languages mid-sentence, spell out codes and names, or ask for instant Thai translation, all in real-time. Soniox doesn’t just transcribe Thai speech – it understands it.
Get every word right, in Thai and every other language

Understand Thai speech with native-speaker fluency.
"We tried a dozen speech-to-text and translation services. Soniox is the best, so that's what we use."
Cayden Pierce,
CEO/CTO at Mentra
Mix languages, not mistakes.
"It’s the first model we’ve used that actually understands Hinglish. Switching mid-sentence just works."
Prakash N,
Co-Founder & Director at Tevatel


Every number, every code, every time.
"As the leading provider of voicebots for automotive dealerships in Germany, we’ve faced significant challenges recognizing license plates accurately. Soniox has solved this problem with exceptional recognition of alphanumeric sequences, resulting in a much higher acceptance rate for our voicebot."
Dr. Steven Zielke,
Founder & CEO of mobilApp
“It just gets the words right — any language, any accent, any context. That’s what accuracy is supposed to look like.”
Tony Wang,
Cofounder & Chief Revenue Officer at Agora
Real-time Thai, word by word
Transcribe Thai at the speed of speech.
"It's so fast, captions appear before people even finish talking. Zero lag. No buffering. Nothing."
Dag-Inge Aas,
Head of AI at Tana

Instantly translate Thai to any other language – and vice versa.
"Live multilingual meetings finally sound natural — Soniox translates fluidly, in real-time."
VP of engineering,
Leading AI assistant company
Domain intelligence, built in

Speak any industry’s language.
"Soniox's ability to accurately transcribe complex medical terminology means our physician-customers spend significantly less time editing. This allows them to finalize their notes faster and focus on what matters most: patient care."
Max Malyk,
Vice President at DeliverHealth
Understand every conversation in context.
"It just gets the context — and when we add our own domain knowledge, it feels completely customized to us."
Mark Boyce,
CEO at MediLogix

Get every term right, every time.
Soniox delivers unmatched accuracy for specialized language, ensuring every technical term, brand name, or uncommon phrase is transcribed exactly as spoken. Simply provide your custom terms – from “SOFR” to “force majeure” – and Soniox captures them flawlessly in real-time.
Stay true in every translation.
Define exactly how key terms and phrases are translated – from medical terminology to brand names and idioms. Control whether “MRI” becomes “RM” or “Silicon Valley” stays the same, preserving both precision and meaning across languages.
Keep up with conversational intelligence

Follow every voice in real-time.
"Soniox knows who’s speaking and when each thought ends. The real-time transcripts read like true dialogue, not data dumps."
Adam Strom,
Co-Founder & President at Mobius MD
Know when to call it quits.
Soniox goes beyond basic timing and silence detection — using advanced endpoint detection that reads tone, meaning, and conversational flow to know when someone is truly finished speaking. The result: smoother, faster, and more natural responses.
"Soniox gives us live transcriptions we can trust — fast, accurate, and natural. It’s why our users trust the experience and keep coming back."
Sidhant Bendre,
Co-Founder at Oleve
Global by default
One API. Every language.

Use Soniox anywhere Thai is spoken
Build Thai-speaking agents and assistants
Power fast and responsive voice agents that understand, transcribe, and respond to Thai.
Capture clinical conversations in Thai
Securely transcribe medical conversations and generate structured notes in Thai.
Generate live Thai captions and translations
Turn Thai podcasts, interviews, and videos into ready-to-publish subtitle files and live captions.
Transcribe and translate Thai on wearable devices
Power hands-free voice features on smartwatches, glasses, and fitness devices.
Built-in tools for Thai speech-to-text
Everything you need for real-time Thai transcription and translation - built right into the Soniox API.
Real-time streaming and async support
On-the-fly language detection
Speaker-aware diarization with punctuation
JSON-formatted and production-ready
Translate between Thai and 60 other languages
See full API capabilities
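As a rough illustration of what speaker-aware, JSON-formatted output makes possible, the sketch below merges consecutive tokens from the same speaker into readable dialogue turns. The token shape and the field names `text` and `speaker` are assumptions made for this example, not something this page confirms; check the API reference for the actual response format.

```python
from itertools import groupby

# Hypothetical token shape: the "text" and "speaker" fields are assumptions
# for this sketch, not a documented response format.
tokens = [
    {"text": "สวัสดี", "speaker": "1"},
    {"text": "ครับ", "speaker": "1"},
    {"text": "Hello", "speaker": "2"},
    {"text": " there", "speaker": "2"},
]


def to_turns(tokens):
    """Merge consecutive tokens from the same speaker into (speaker, text) turns."""
    return [
        (speaker, "".join(tok["text"] for tok in group))
        for speaker, group in groupby(tokens, key=lambda tok: tok["speaker"])
    ]


for speaker, text in to_turns(tokens):
    print(f"Speaker {speaker}: {text}")
```

Grouping only consecutive tokens keeps the turns in conversation order, so a speaker who talks twice gets two separate turns.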
Thai transcription with industry-leading accuracy
Never miss a word in Thai, even when it's fast, messy, accented, or hard to hear. That accuracy means fewer errors, better UX, and apps people can trust.
- Streams fluent, full-sentence output in real-time
- Handles regional Thai accents, noise, and overlapping speech
- Built to perform across real-world conditions
Don't take our word for it. Use your own Thai audio to compare Soniox against other providers live.
Soniox outperforms other providers on Thai accuracy, measured as word error rate (WER; lower is better):
| Provider | Thai WER |
|---|---|
| Soniox | 9.5% |
| OpenAI | 33.7% |
| | 58.2% |
| AWS | 14.2% |
| Azure | 15.5% |
| Deepgram | 48.6% |
| AssemblyAI | 85.6% |
| Speechmatics | 13.3% |
| ElevenLabs | 72.5% |
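For readers who want to reproduce this kind of comparison on their own audio, here is a minimal sketch of the standard WER calculation: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function name `word_error_rate` is illustrative. Thai is written without spaces between words, so the score depends on how both transcripts are segmented into tokens; this page does not say which segmentation or text normalization was used for the figures above, so the sketch takes already-tokenized input.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] is the edit distance between the reference prefix handled
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # turning i reference tokens into nothing takes i deletions
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing
                curr[j - 1] + 1,     # insertion: extra hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


reference = "the cat sat on the mat".split()
hypothesis = "the cat sit on mat".split()
# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
print(f"{word_error_rate(reference, hypothesis):.1%}")  # 33.3%
```

Because insertions count as errors, WER can exceed 100% when a transcript contains many extra words.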
Frequently asked questions
1. How is Soniox different from other Thai speech APIs?
2. Is Soniox just for transcription, or can I use Soniox to translate Thai speech as well?
3. What types of apps can I build with Soniox?
4. What does the output look like?
5. How fast is Soniox?
6. How much does Soniox cost?
Get started with the Soniox API
Explore the docs
Find guides, API reference, and code samples to help you build fast.
View docs
Go global with one API
Get production-ready speech-to-text recognition, transcription, and translation in 60+ languages.