WebSocket API
Learn how to use and integrate the Soniox Text-to-Speech WebSocket API.
Overview
The Soniox WebSocket API provides real-time Text-to-Speech with
low latency over a persistent WebSocket connection. A single connection
can host up to 5 concurrent streams multiplexed by stream_id.
It is ideal for voice agents, interactive assistants, and LLM-driven
applications where audio must start playing before the full text is
generated.
WebSocket endpoint
Connect to the API using:
Configuration
Before streaming text for a stream, send a configuration message on the WebSocket connection.
Send one config message per stream_id you want to start.
Parameters
- api_key (required, string): Your Soniox API key. Create API keys in the Soniox Console. For client apps, generate a temporary API key from your server to keep secrets secure.
- stream_id (required, string): Client-generated stream identifier. Must be unique among active streams on the same WebSocket connection. You may reuse a stream_id only after its previous stream is terminated.
- language (required, string): Language code. See the list of supported languages and their ISO codes. Example: "en"
- client_reference_id (string): Optional client-defined identifier recorded with this request in usage logs. Does not need to be unique. Ignored if the request authenticates with a temporary API key.
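As a rough illustration, a configuration message could be assembled as follows. This is a minimal sketch, not the official client: the helper name `build_config_message` is hypothetical, and generating the stream_id from a UUID is just one way to keep it unique. The error catalog on this page also lists messages such as "Missing model", "Missing voice", and "Missing audio_format", which suggests further fields are expected; the sketch lets the caller pass them through `extra` rather than guessing their values.

```python
import json
import uuid


def build_config_message(api_key, language, stream_id=None,
                         client_reference_id=None, **extra):
    """Return (stream_id, JSON string) for one stream's configuration message."""
    # stream_id is client-generated; a UUID is an assumed, simple way to keep
    # it unique among active streams on the same connection.
    stream_id = stream_id or uuid.uuid4().hex
    message = {"api_key": api_key, "stream_id": stream_id, "language": language}
    if client_reference_id is not None:
        message["client_reference_id"] = client_reference_id
    # Pass-through for fields that are not in the parameter list above.
    message.update(extra)
    return stream_id, json.dumps(message)


stream_id, payload = build_config_message(
    "<SONIOX_API_KEY>", "en", stream_id="stream-1")
print(payload)
```

Send one such message per stream_id before any text for that stream.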
Text streaming
After sending the configuration message for a stream, send text messages for that same stream_id:
Final text chunk:
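The text messages described above might be produced like this. It is a sketch under stated assumptions: the `text` and `text_end` field names are taken from the error messages on this page, the 5000 limit from "Text is too long (max length 5000)" is assumed to count characters, sending an explicit `text_end: false` on non-final messages is assumed to be accepted, and `text_messages` is a hypothetical helper name. The generator holds back one piece so that the last message can carry text_end: true without buffering the whole input.

```python
import json

MAX_TEXT_LENGTH = 5000  # assumed to be a character count


def _message(stream_id, text, text_end):
    return json.dumps({"stream_id": stream_id, "text": text, "text_end": text_end})


def text_messages(stream_id, chunks):
    """Yield JSON text messages; only the final one has text_end set to true."""
    previous = None
    for chunk in chunks:
        # Split oversized chunks so that no single message exceeds the limit.
        for start in range(0, len(chunk), MAX_TEXT_LENGTH):
            if previous is not None:
                yield _message(stream_id, previous, False)
            previous = chunk[start:start + MAX_TEXT_LENGTH]
    if previous is not None:
        yield _message(stream_id, previous, True)


for message in text_messages("stream-1", ["Hello, ", "world."]):
    print(message)
```

If `chunks` is empty, nothing is yielded, so the caller would still need to end or cancel the stream separately.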
Ending the stream
A stream ends with a three-step handshake: the client sends a text message with text_end: true, the server sends the last audio payload with audio_end: true, then {"terminated": true}.
For the full lifecycle, including error completion and client-initiated cancellation, see Stream termination.
Normal completion
A stream completes in this order:
- The client sends a text message with text_end: true for the target stream_id.
- The server sends the last audio payload with audio_end: true.
- The server sends a final stream event with terminated: true.
What audio_end means
audio_end: true marks the last audio chunk for that stream. No more audio payloads will follow.
You should still keep the stream open and wait for the terminal terminated: true event.
What terminated means
terminated: true indicates the server has fully closed the stream and released all stream resources.
Only after terminated: true is it safe to:
- reuse the same stream_id
- stop tracking stream state
- consider the stream lifecycle complete
Treat the stream as complete only after you receive terminated: true.
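The distinction between audio_end and terminated can be sketched as a small client-side tracker. The class name `StreamTracker` and its state labels are hypothetical; only the message fields (stream_id, audio_end, terminated) and the limit of 5 concurrent streams come from this page.

```python
import json

MAX_STREAMS = 5  # per-connection limit stated in the Overview


class StreamTracker:
    """Tracks which stream_ids are still in use on one WebSocket connection."""

    def __init__(self):
        self.state = {}  # stream_id -> "active" or "audio_done"

    def start(self, stream_id):
        if stream_id in self.state:
            raise ValueError(f"stream_id {stream_id!r} is not terminated yet")
        if len(self.state) >= MAX_STREAMS:
            raise RuntimeError("all stream slots on this connection are in use")
        self.state[stream_id] = "active"

    def on_server_message(self, raw):
        """Update state from one server message; return True once terminated."""
        message = json.loads(raw)
        stream_id = message.get("stream_id")
        if message.get("terminated"):
            # Only now is the stream_id free for reuse.
            self.state.pop(stream_id, None)
            return True
        if message.get("audio_end") and stream_id in self.state:
            # No more audio will follow, but the stream is not finished yet.
            self.state[stream_id] = "audio_done"
        return False


tracker = StreamTracker()
tracker.start("stream-1")
tracker.on_server_message('{"stream_id": "stream-1", "audio": "AAAA", "audio_end": true}')
print(tracker.state)
tracker.on_server_message('{"stream_id": "stream-1", "terminated": true}')
print(tracker.state)
```

Keeping the entry until terminated arrives means a premature reuse of the same stream_id is caught locally instead of producing a server-side error.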
Error completion
If an error occurs for a stream:
- The server sends an error response for that stream_id.
- The server sends {"terminated": true} for that same stream_id.
- The failed stream is removed, but the WebSocket connection stays open and other streams can continue.
Client-initiated cancellation
To cancel a stream, send a cancel message. The server finalizes the stream and does not send audio chunks.
Cancel request:
Finalization response:
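A cancel request and the check for its finalization could look roughly like this. The `cancel` field name comes from the error catalog on this page, which also states that it cannot be combined with `text` or `text_end`; the boolean value `true` and both helper names are assumptions made for illustration.

```python
import json


def build_cancel_message(stream_id):
    """Return a JSON cancel message for one stream."""
    # 'cancel' is sent on its own, without 'text' or 'text_end'.
    # The value True is an assumption; the page does not show the payload.
    return json.dumps({"stream_id": stream_id, "cancel": True})


def is_finalization(raw, stream_id):
    """True if the server message finalizes the given stream."""
    message = json.loads(raw)
    return message.get("stream_id") == stream_id and message.get("terminated") is True


print(build_cancel_message("stream-1"))
print(is_finalization('{"stream_id": "stream-1", "terminated": true}', "stream-1"))
```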
Error handling
One failed stream does not close the whole WebSocket connection.
- Stream-level runtime errors (inside a running stream):
  - The server sends an error response for that stream_id.
  - The server then sends {"terminated": true} for that same stream_id.
  - Only that stream ends; other active streams continue.
- Validation/input errors (invalid start/text message, unknown stream, malformed stream message):
  - The server sends an error response.
  - The WebSocket message loop stays alive, so valid streams can continue.
- Connection-level failures (WebSocket disconnect/read/write failure, forced shutdown):
  - The WebSocket connection closes.
  - All streams on that connection end.
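The isolation rules above can be sketched as a function that replays server messages from one connection and reports each stream's outcome. The function name `route_messages` is hypothetical, and it assumes that stream-level error responses carry the stream_id and the error_type field described in the Error response section.

```python
import json


def route_messages(raw_messages):
    """Return (outcome, errors) for messages multiplexed on one connection."""
    outcome = {}  # stream_id -> "ok" or "failed", set when terminated arrives
    errors = {}   # stream_id (None if absent) -> list of error_type values
    for raw in raw_messages:
        message = json.loads(raw)
        stream_id = message.get("stream_id")
        if "error_type" in message:
            errors.setdefault(stream_id, []).append(message["error_type"])
        elif message.get("terminated"):
            # A stream that errored still ends with terminated: true.
            outcome[stream_id] = "failed" if stream_id in errors else "ok"
    return outcome, errors


messages = [
    '{"stream_id": "a", "error_type": "internal_error"}',
    '{"stream_id": "b", "audio": "AAAA"}',
    '{"stream_id": "a", "terminated": true}',
    '{"stream_id": "b", "audio": "AAAA", "audio_end": true}',
    '{"stream_id": "b", "terminated": true}',
]
print(route_messages(messages))
```

In this replay stream "a" fails while stream "b" completes normally on the same connection.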
Response
Server messages are JSON and include stream_id for stream-specific events.
Successful audio messages include audio (base64 chunk), and terminal messages include terminated.
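Decoding an audio message might look like the sketch below. The field names audio, audio_end, and stream_id come from this page; the helper name `extract_audio` is hypothetical, and the format of the decoded bytes depends on the stream's audio settings, which are not covered here.

```python
import base64
import json


def extract_audio(raw):
    """Return (stream_id, audio_bytes, audio_end) for an audio message, else None."""
    message = json.loads(raw)
    if "audio" not in message:
        return None  # terminal or error message, handled elsewhere
    return (
        message.get("stream_id"),
        base64.b64decode(message["audio"]),
        bool(message.get("audio_end", False)),
    )


print(extract_audio('{"stream_id": "stream-1", "audio": "AAE=", "audio_end": true}'))
```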
Terminal stream message
Error response
If an error occurs, the server returns an error message:
- error_code (number): HTTP status code of the error.
- error_type (string): Stable, machine-readable identifier of the error. Branch on this, not on error_message. See the Errors reference for the full catalog and recovery steps.
- error_message (string): Human-readable description of the error.
- more_info (string): Link to the section on the Errors page describing this error_type.
- request_id (string): Unique identifier of this request. Include it when contacting support@soniox.com; server logs are keyed on it.
For error scoping and isolation behavior when one WebSocket hosts multiple streams, see Streams.
For the full catalog of error_type values across all Soniox APIs, see the Errors reference.
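One way to turn an error response into a typed value is sketched below. The field names match the list above; the `ErrorResponse` class, the `parse_error` helper, and the sample values in the demo are illustrative assumptions. Only error_type is treated as mandatory, so the parser does not break if a field is absent.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ErrorResponse:
    error_type: str
    error_code: Optional[int] = None
    error_message: Optional[str] = None
    more_info: Optional[str] = None
    request_id: Optional[str] = None
    stream_id: Optional[str] = None


def parse_error(raw):
    """Return an ErrorResponse if the message is an error, else None."""
    message = json.loads(raw)
    if "error_type" not in message:
        return None
    known = {f.name for f in fields(ErrorResponse)}
    # Ignore unknown keys so that new server fields do not raise here.
    return ErrorResponse(**{k: v for k, v in message.items() if k in known})


# Illustrative values only; real messages come from the server.
raw = json.dumps({"stream_id": "stream-1", "error_code": 400,
                  "error_type": "invalid_request",
                  "error_message": "<error_message>", "request_id": "<request_id>"})
error = parse_error(raw)
print(error.error_type, error.request_id)
```

Logging request_id alongside error_type keeps the information support would ask for.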
Full list of possible error codes and messages
The request is malformed or contains invalid parameters. error_type is one of
invalid_request,
invalid_stream_state,
max_concurrent_streams_reached,
or model_not_available.
- API key is too long (max length 250).
- Audio format is too long (max length 50).
- Expected a text message. Binary frames are not accepted on this endpoint.
- Invalid language '<language>' for model '<model>'.
- Invalid message format. Expected JSON matching a start, text, or keep-alive request.
- Invalid voice '<voice>' for model '<model>'.
- Language is required.
- Language is too long (max length 50).
- Maximum concurrent streams per connection (N) reached. Send a cancel message for one of your active streams to free a slot, or open a new WebSocket connection.
- Missing audio_format
- Missing language
- Missing model
- Missing stream_id
- Missing voice
- Model name is too long (max length 50).
- Stream <stream_id> has already been cancelled. Start a new stream to send more text.
- Stream <stream_id> has already received text_end and is closed for input. Start a new stream to send more text.
- Stream <stream_id> is already active on this connection. Choose a different stream_id, or cancel the existing stream first.
- Stream <stream_id> not found. Send a start message first.
- Stream ID is too long (max length 256).
- Text is too long (max length 5000).
- The 'cancel' field cannot be combined with 'text' or 'text_end'. Send 'cancel' on its own to stop a stream.
- The requested model is not available. See https://soniox.com/docs/tts/models for the list of supported TTS models.
- Voice is too long (max length 50).
Authentication is missing or incorrect. Ensure a valid API key is provided before retrying.
error_type: unauthenticated.
- Incorrect API key provided. You can get an API key at https://console.soniox.com
- Invalid or expired temporary API key. Create a new temporary API key and retry. See https://soniox.com/docs/guides/temporary-api-keys for details.
- Missing API key. Provide API key as a header (i.e. Authorization: Bearer <SONIOX_API_KEY>). You can get an API key at https://console.soniox.com
- The temporary API key cannot be used for this action. Each temporary API key is scoped to a specific `usage_type`; create a new key with the correct usage type.
The organization's balance or monthly usage limit has been reached.
error_type is one of
organization_balance_exhausted,
organization_monthly_budget_exhausted,
or project_monthly_budget_exhausted.
- Organization balance exhausted. Please either add funds manually or enable autopay.
- Organization monthly budget exhausted. Please increase it.
- Project monthly budget exhausted. Please increase it.
The temporary API key in use was created with a max_session_duration_seconds cap,
and that duration has elapsed for the current session.
error_type: temp_api_key_session_expired.
Temporary API key session duration limit exceeded. Create a new temporary API key to start a new session.
A backend call exceeded its deadline before completing. Retry the request.
error_type: request_timeout.
A usage or rate limit has been exceeded. You may retry after a delay or request an increase in limits via the
Soniox Console.
error_type: limit_exceeded.
- Concurrent requests limit for text-to-speech has been exceeded for your organization.
- Concurrent requests limit for text-to-speech has been exceeded for your project.
- Requests per minute limit for text-to-speech has been exceeded for your organization.
- Requests per minute limit for text-to-speech has been exceeded for your project.
An unexpected server-side error occurred. The request may be retried.
error_type: internal_error.
The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our support email support@soniox.com if you keep seeing this error.
The service cannot accept the request right now (upstream overload, cache exhausted, shutdown).
Retry with backoff. The numeric (code N) in the message identifies the sub-cause for support triage.
error_type: service_unavailable.
Cannot continue request (code N). Please restart the request. Refer to: https://soniox.com/url/cannot-continue-request
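The retry guidance in this catalog can be condensed into a small policy keyed on error_type. This is a sketch under assumptions: the set of retryable types reflects the entries above that mention retrying (request_timeout, limit_exceeded, internal_error, service_unavailable), while the function names and the backoff parameters (0.5 s base, 30 s cap, full jitter) are illustrative choices, not values from Soniox.

```python
import random

# error_type values whose descriptions above mention retrying.
RETRYABLE = {"request_timeout", "limit_exceeded", "internal_error", "service_unavailable"}


def should_retry(error_type):
    """True if the catalog above suggests the request may be retried."""
    return error_type in RETRYABLE


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter; attempt starts at 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


for error_type in ("service_unavailable", "unauthenticated"):
    print(error_type, should_retry(error_type))
print(round(backoff_delay(3), 2))
```

Errors such as unauthenticated or the budget-related types are left out because retrying them unchanged would not help; they need a new key or a configuration change first.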
Code example
Prerequisite: Complete the steps in Get started.
See on GitHub: soniox_sdk_realtime.py.
See on GitHub: soniox_sdk_realtime.js.
See on GitHub: soniox_realtime.py.
See on GitHub: soniox_realtime.js.