Realtime Client
Soniox Python SDK - Realtime Client Reference
RealtimeAPI
Entrypoint for realtime helpers on SonioxClient.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
client | SonioxClient | Soniox client instance. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| stt | RealtimeSTTClient | Speech-to-text API namespace. |
| tts | RealtimeTTSClient | Text-to-Speech API namespace. |
AsyncRealtimeAPI
Entrypoint for async realtime helpers on AsyncSonioxClient.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
client | AsyncSonioxClient | Soniox client instance. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| stt | AsyncRealtimeSTTClient | Speech-to-text API namespace. |
| tts | AsyncRealtimeTTSClient | Text-to-Speech API namespace. |
RealtimeSTTClient
Factory for creating synchronous realtime speech-to-text sessions.
This class validates credentials and prepares session configuration, but does not itself manage WebSocket connections.
Constructor
Create a realtime STT client bound to an existing API client.
Parameters
| Parameter | Type | Description |
|---|---|---|
client | SonioxClient | Parent Soniox client providing configuration and credentials. |
Returns
None
connect()
Create a new realtime STT session.
The returned session is not connected until entered as a context manager.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | RealtimeSTTConfig | Realtime transcription configuration. |
| api_key | str \| None | Optional API key override. If not provided, the client's default API key is used. |
Returns
RealtimeSTTSession
A new RealtimeSTTSession instance.
Raises
SonioxValidationError: If no API key is available.
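The deferred-connection behaviour described above can be illustrated with a minimal sketch. `FakeSTTClient`, `FakeSession`, the `connected` flag, and the placeholder config are hypothetical stand-ins (none of them are part of the SDK) so that the snippet runs without network access; only the `connect(config=...)` call and the context-manager usage come from this reference.

```python
class FakeSession:
    """Hypothetical stand-in for RealtimeSTTSession (not part of the SDK)."""

    def __init__(self, config):
        self.config = config
        self.connected = False  # illustrative flag, not a documented property

    def __enter__(self):
        # Per the reference, the connection is opened on entry, not in connect().
        self.connected = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.connected = False
        return False


class FakeSTTClient:
    """Hypothetical stand-in for RealtimeSTTClient."""

    def connect(self, config, api_key=None):
        return FakeSession(config)


def connection_states(stt, config):
    """Return the connected flag before, inside, and after the `with` block."""
    session = stt.connect(config=config)
    before = session.connected
    with session as active:
        during = active.connected
    after = session.connected
    return before, during, after


states = connection_states(FakeSTTClient(), config={"model": "placeholder"})
print(states)
```

With the real client, `stt` would be a RealtimeSTTClient and `config` a RealtimeSTTConfig; audio would be sent and events read inside the `with` block.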
RealtimeSTTSession
Synchronous WebSocket session for a single real-time speech-to-text stream.
This class manages the full lifecycle of a real-time transcription session: connecting to the WebSocket endpoint, streaming audio data, receiving events, and gracefully closing the stream. A session is stateful and represents exactly one streaming interaction with the Soniox realtime API.
Instances are designed to be used as context managers.
Constructor
Create a new realtime STT session.
This does not open a network connection. The WebSocket connection is established when entering the context manager.
Parameters
| Parameter | Type | Description |
|---|---|---|
url | str | WebSocket URL for the realtime transcription endpoint. |
config | RealtimeSTTConfig | Configuration describing the audio format and transcription behavior for this session. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| config | RealtimeSTTConfig | Configuration used to initialize this session. |
| paused | bool | True if the session is currently paused. |
| last_message | RealtimeEvent \| None | Most recently received realtime event, if any. |
close()
Gracefully close the realtime session.
Sends a final empty message to signal end-of-stream, then closes the WebSocket connection. Calling this method multiple times is safe.
Returns
None
send_byte_chunk()
Send a single chunk of raw audio bytes to the realtime stream.
The audio data must match the format declared in the session configuration (sample rate, channels, encoding).
Parameters
| Parameter | Type | Description |
|---|---|---|
chunk | bytes | Raw audio bytes to send. |
Returns
None
Raises
SonioxRealtimeError: If the session is not connected or the send operation fails.
send_bytes()
Send audio data to the realtime stream.
This method accepts either a single bytes object or an iterator yielding audio chunks. When an iterator is provided and finish is set, a FINISH control message is sent automatically after all chunks have been transmitted.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunks | bytes \| Iterator[bytes] | Audio data as raw bytes or an iterator of byte chunks. |
| finish | bool | Whether to send a finish signal after streaming completes. |
Returns
None
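One way to produce the iterator form of `chunks` is a small generator that slices a buffer into fixed-size pieces. This is a sketch only: `iter_chunks` is a hypothetical helper, and the default chunk size of 4096 bytes is an arbitrary choice, not an SDK requirement.

```python
from typing import Iterator


def iter_chunks(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive slices of `audio`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


chunks = list(iter_chunks(b"abcdefg", chunk_size=3))
print(chunks)  # [b'abc', b'def', b'g']
```

Inside a connected session, the generator could be passed directly, for example `session.send_bytes(iter_chunks(audio))`.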
send_control_message()
Send a control message to the realtime session.
Control messages modify the state of the stream, such as signaling end-of-audio or requesting finalization.
Parameters
| Parameter | Type | Description |
|---|---|---|
control_type | RealtimeControlType | The type of control message to send. |
Returns
None
Raises
SonioxRealtimeError: If the session is not connected or the message cannot be sent.
finish()
Signal that no more audio will be sent for this session.
Returns
None
keep_alive()
Send a keep-alive message to prevent the session from timing out.
Returns
None
finalize()
Finalize all outstanding non-final tokens while keeping the session open.
Subsequent tokens will be delivered with is_final=True.
Returns
None
recv_bytes()
Receive a raw message from the WebSocket connection.
Returns
bytes
The received message as bytes. An empty bytes object indicates that the connection has been closed.
parse_event()
Parse a raw WebSocket message into a structured realtime event.
Parameters
| Parameter | Type | Description |
|---|---|---|
| raw | str \| bytes | Raw message payload received from the server. |
Returns
RealtimeEvent
A validated RealtimeEvent instance.
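The two low-level methods above can be combined into a manual receive loop that stops on the documented empty-bytes signal. In this sketch `FakeSession` is a hypothetical stand-in that replays canned messages, and its `parse_event()` returns a plain dict rather than a validated RealtimeEvent.

```python
import json


class FakeSession:
    """Hypothetical stand-in for a connected session (not part of the SDK)."""

    def __init__(self, messages):
        self._messages = list(messages)

    def recv_bytes(self) -> bytes:
        # Empty bytes signals a closed connection, as documented above.
        return self._messages.pop(0) if self._messages else b""

    def parse_event(self, raw):
        # The real method returns a RealtimeEvent; a dict is used here.
        return json.loads(raw)


def read_until_closed(session):
    """Collect parsed events until recv_bytes() returns an empty payload."""
    events = []
    while True:
        raw = session.recv_bytes()
        if not raw:
            break
        events.append(session.parse_event(raw))
    return events


events = read_until_closed(FakeSession([b'{"n": 1}', b'{"n": 2}']))
print(events)
```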
receive_event()
Receive and parse the next realtime event from the server.
Returns
RealtimeEvent | None
The next RealtimeEvent, or None if the connection has closed.
Raises
SonioxRealtimeError: If the session is not connected.
receive_events()
Yield realtime events as they are received from the server.
Iteration stops automatically when the connection is closed.
Returns
Iterator[RealtimeEvent]
handle_events()
Receive realtime events and dispatch them to a handler callback.
Parameters
| Parameter | Type | Description |
|---|---|---|
handler | Callable[[RealtimeEvent], None] | Callable invoked for each received RealtimeEvent. |
Returns
None
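A handler can be any callable that accepts one event. The sketch below uses a small callable class that records what it receives. `EventLog` and `FakeSession` are hypothetical, and the fake's `handle_events()` assumes the simplest plausible behaviour (one handler call per received event); plain strings stand in for RealtimeEvent objects.

```python
class EventLog:
    """Callable handler that keeps every event it is given."""

    def __init__(self):
        self.events = []

    def __call__(self, event) -> None:
        self.events.append(event)


class FakeSession:
    """Hypothetical stand-in for a connected RealtimeSTTSession."""

    def __init__(self, events):
        self._events = events

    def receive_events(self):
        yield from self._events

    def handle_events(self, handler) -> None:
        # Assumed behaviour: call the handler once per received event.
        for event in self.receive_events():
            handler(event)


log = EventLog()
FakeSession(["first", "second"]).handle_events(log)
print(log.events)
```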
pause()
Pause the session, suppressing outgoing audio and starting a background keepalive thread.
While paused, calls to send_byte_chunk() are silently dropped.
A background thread sends a keepalive message every KEEP_ALIVE_INTERVAL_SEC seconds to prevent the server from timing out the session.
Calling pause on an already-paused session is a no-op.
Returns
None
Raises
SonioxRealtimeError: If the session is not connected.
resume()
Resume a paused session, stopping the keepalive thread and allowing audio to be sent again.
Calling resume on a session that is not paused is a no-op.
Returns
None
Raises
SonioxRealtimeError: If the session is not connected.
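Because pause() and resume() come in pairs, one possible pattern is to wrap them in a context manager so that the session is always resumed, even if the block raises. The `paused` helper and `FakeSession` below are hypothetical; the fake mimics only the documented dropping of audio while paused and omits the keepalive thread.

```python
from contextlib import contextmanager


@contextmanager
def paused(session):
    """Pause `session` for the duration of the block, then resume it."""
    session.pause()
    try:
        yield session
    finally:
        # resume() is documented as a no-op if the session is not paused.
        session.resume()


class FakeSession:
    """Hypothetical stand-in; drops audio while paused, as documented."""

    def __init__(self):
        self.paused = False
        self.sent = []

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def send_byte_chunk(self, chunk):
        if not self.paused:
            self.sent.append(chunk)


session = FakeSession()
with paused(session):
    session.send_byte_chunk(b"dropped")
session.send_byte_chunk(b"kept")
print(session.sent)
```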
RealtimeTTSClient
Factory for synchronous realtime Text-to-Speech connections and streams.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
client | SonioxClient | Soniox client instance. |
Returns
None
connect()
Create a single-stream realtime Text-to-Speech connection.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | RealtimeTTSConfig | Configuration options for this operation. |
| api_key | str \| None | API key used for authentication. |
Returns
RealtimeTTSConnection
connect_multi_stream()
Create a multiplexed realtime Text-to-Speech connection.
Returns
RealtimeTTSMultiplexedConnection
RealtimeTTSConnection
Synchronous WebSocket connection for one realtime Text-to-Speech stream.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | str | WebSocket URL for the realtime Text-to-Speech endpoint. |
| config | RealtimeTTSConfig | Configuration options for this operation. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| config | RealtimeTTSConfig | Configuration used to initialize this connection. |
| paused | bool | True if the connection is currently paused. |
| last_message | RealtimeTTSEvent \| None | Most recently received realtime event, if any. |
close()
Close the realtime Text-to-Speech connection.
Returns
None
send_text_chunk()
Send one text chunk to the realtime stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
text | str | Text chunk to generate into speech. |
text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
send_text_chunks()
Send text data to the realtime stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunks | str \| Iterator[str] | Text as a single string or an iterator of text chunks. |
| text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
finish()
Signal that no more text will be sent for this stream.
Returns
None
cancel()
Cancel the realtime Text-to-Speech stream.
Returns
None
keep_alive()
Send a keep-alive message to prevent the session from timing out.
Returns
None
pause()
Pause outgoing text and start periodic keep-alive messages.
Returns
None
resume()
Resume outgoing text and stop periodic keep-alive messages.
Returns
None
recv_bytes()
Receive one raw websocket message payload as bytes.
Returns
bytes
parse_event()
Parse a raw websocket message into a realtime event.
Parameters
| Parameter | Type | Description |
|---|---|---|
| raw | str \| bytes | Raw event payload from the realtime API. |
Returns
RealtimeTTSEvent
receive_event()
Receive and parse the next realtime event.
Returns
RealtimeTTSEvent | None
receive_events()
Yield realtime events until the stream ends or closes.
Returns
Iterator[RealtimeTTSEvent]
receive_audio_chunks()
Yield decoded audio chunks from incoming realtime events.
Returns
Iterator[bytes]
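The send and receive methods above can be combined into a small helper that streams text and joins the resulting audio. This is a sketch under stated assumptions: `synthesize` and `FakeTTSConnection` are hypothetical, the connection is assumed to be open already, and the fake simply echoes the text as bytes instead of producing real audio.

```python
class FakeTTSConnection:
    """Hypothetical stand-in for an open RealtimeTTSConnection."""

    def __init__(self):
        self.sent = []

    def send_text_chunk(self, text, text_end=False):
        self.sent.append((text, text_end))

    def receive_audio_chunks(self):
        # Pretend each text chunk produced one chunk of audio bytes.
        for text, _ in self.sent:
            yield text.encode("utf-8")


def synthesize(connection, text_chunks):
    """Send every text chunk, flag the last one as final, and join the audio."""
    chunks = list(text_chunks)
    if not chunks:
        raise ValueError("at least one text chunk is required")
    for index, text in enumerate(chunks):
        connection.send_text_chunk(text, text_end=(index == len(chunks) - 1))
    return b"".join(connection.receive_audio_chunks())


connection = FakeTTSConnection()
audio = synthesize(connection, ["Hello, ", "world."])
print(connection.sent)
```

Setting `text_end` only on the last chunk mirrors the parameter description above; calling finish() after the final chunk may be an equivalent alternative.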
RealtimeTTSMultiplexedConnection
Synchronous websocket connection that can host multiple Text-to-Speech streams.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | str | WebSocket URL for the realtime Text-to-Speech endpoint. |
| api_key | str | API key used for authentication. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| last_message | RealtimeTTSEvent \| None | Most recently received realtime event, if any. |
| paused | bool | True if the connection is currently paused. |
close()
Close the websocket and clear the stream state.
Returns
None
keep_alive()
Send a keep-alive message to prevent the session from timing out.
Returns
None
pause()
Pause outgoing text and start periodic keep-alive messages.
Returns
None
resume()
Resume outgoing text and stop periodic keep-alive messages.
Returns
None
open_stream()
Register and start a new stream on the shared websocket.
Parameters
| Parameter | Type | Description |
|---|---|---|
config | RealtimeTTSConfig | Configuration options for this operation. |
Returns
RealtimeTTSStream
RealtimeTTSStream
Handle for one stream on a multiplexed realtime TTS connection.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
connection | RealtimeTTSMultiplexedConnection | Synchronous websocket connection that can host multiple Text-to-Speech streams. |
config | RealtimeTTSConfig | Configuration options for this operation. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| config | RealtimeTTSConfig | Stream configuration. |
| stream_id | str | Stream identifier. |
| last_message | RealtimeTTSEvent \| None | Most recently received event for this stream, if any. |
send_text_chunk()
Send one text chunk for this stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
text | str | Text chunk to generate into speech. |
text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
send_text_chunks()
Send text chunks for this stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunks | str \| Iterator[str] | Text as a single string or an iterator of text chunks. |
| text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
finish()
Signal that no more text will be sent for this stream.
Returns
None
cancel()
Cancel this stream.
Returns
None
keep_alive()
Send a keepalive message on the underlying shared connection.
Returns
None
pause()
Pause the underlying shared connection and start keepalive.
Returns
None
resume()
Resume the underlying shared connection and stop keepalive.
Returns
None
receive_event()
Receive the next event for this stream.
Returns
RealtimeTTSEvent | None
receive_events()
Yield events for this stream until it ends.
Returns
Iterator[RealtimeTTSEvent]
receive_audio_chunks()
Yield decoded audio chunks for this stream.
Returns
Iterator[bytes]
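A multiplexed connection might be used to synthesize several texts over one websocket, as sketched below. `FakeMultiplexedConnection`, `FakeStream`, `synthesize_many`, the `stream-N` identifier format, and the placeholder config are all hypothetical; only `open_stream()`, `send_text_chunk()`, `receive_audio_chunks()`, and `stream_id` are taken from this reference.

```python
class FakeStream:
    """Hypothetical stand-in for RealtimeTTSStream."""

    def __init__(self, stream_id, config):
        self.stream_id = stream_id
        self.config = config
        self._text = []

    def send_text_chunk(self, text, text_end=False):
        self._text.append(text)

    def receive_audio_chunks(self):
        # Upper-cased text stands in for synthesized audio bytes.
        for text in self._text:
            yield text.upper().encode("utf-8")


class FakeMultiplexedConnection:
    """Hypothetical stand-in for RealtimeTTSMultiplexedConnection."""

    def __init__(self):
        self._count = 0

    def open_stream(self, config):
        self._count += 1
        return FakeStream(f"stream-{self._count}", config)


def synthesize_many(connection, texts, config):
    """Open one stream per text and return the audio keyed by stream_id."""
    streams = [connection.open_stream(config=config) for _ in texts]
    for stream, text in zip(streams, texts):
        stream.send_text_chunk(text, text_end=True)
    return {s.stream_id: b"".join(s.receive_audio_chunks()) for s in streams}


result = synthesize_many(FakeMultiplexedConnection(), ["hi", "bye"], config={})
print(result)
```

Draining the streams one after another keeps the sketch simple; whether the real connection buffers events for streams that are not currently being read is not stated here, so concurrent readers may be needed in practice.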
AsyncRealtimeSTTClient
Factory for creating asynchronous realtime speech-to-text sessions.
This class validates credentials and prepares session configuration, but does not itself manage WebSocket connections.
Constructor
Create a realtime STT client bound to an existing API client.
Parameters
| Parameter | Type | Description |
|---|---|---|
client | AsyncSonioxClient | Parent Soniox client providing configuration and credentials. |
Returns
None
connect()
Create a new realtime STT session.
The returned session is not connected until entered as an async context manager.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | RealtimeSTTConfig | Realtime transcription configuration. |
| api_key | str \| None | Optional API key override. If not provided, the client's default API key is used. |
Returns
AsyncRealtimeSTTSession
A new AsyncRealtimeSTTSession instance.
Raises
SonioxValidationError: If no API key is available.
AsyncRealtimeSTTSession
Asynchronous WebSocket session for a single real-time speech-to-text stream.
This class manages the full lifecycle of a real-time transcription session: connecting to the WebSocket endpoint, streaming audio data, receiving events, and gracefully closing the stream. A session is stateful and represents exactly one streaming interaction with the Soniox realtime API.
Instances are designed to be used as async context managers.
Constructor
Create a new realtime STT session.
This does not open a network connection. The WebSocket connection is established when entering the async context manager.
Parameters
| Parameter | Type | Description |
|---|---|---|
url | str | WebSocket URL for the realtime transcription endpoint. |
config | RealtimeSTTConfig | Configuration describing the audio format and transcription behavior for this session. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| config | RealtimeSTTConfig | Configuration used to initialize this session. |
| paused | bool | True if the session is currently paused. |
| last_message | RealtimeEvent \| None | Most recently received realtime event, if any. |
close()
Gracefully close the realtime session.
Sends a final empty message to signal end-of-stream, then closes the WebSocket connection. Calling this method multiple times is safe.
Returns
None
send_byte_chunk()
Send a single chunk of raw audio bytes to the realtime stream.
The audio data must match the format declared in the session configuration (sample rate, channels, encoding).
Parameters
| Parameter | Type | Description |
|---|---|---|
chunk | bytes | Raw audio bytes to send. |
Returns
None
Raises
SonioxRealtimeError: If the session is not connected or the send operation fails.
send_bytes()
Send audio data to the realtime stream.
This method accepts either a single bytes object or an async iterator yielding audio chunks. When an async iterator is provided and finish is set, a FINISH control message is sent automatically after all chunks have been transmitted.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunks | bytes \| AsyncIterator[bytes] | Audio data as raw bytes or an async iterator of byte chunks. |
| finish | bool | Whether to send a finish signal after streaming completes. |
Returns
None
send_control_message()
Send a control message to the realtime session.
Control messages modify the state of the stream, such as signaling end-of-audio or requesting finalization.
Parameters
| Parameter | Type | Description |
|---|---|---|
control_type | RealtimeControlType | The type of control message to send. |
Returns
None
Raises
SonioxRealtimeError: If the session is not connected or the message cannot be sent.
finish()
Signal that no more audio will be sent for this session.
Returns
None
keep_alive()
Send a keep-alive message to prevent the session from timing out.
Returns
None
finalize()
Finalize all outstanding non-final tokens while keeping the session open.
Subsequent tokens will be delivered with is_final=True.
Returns
None
recv_bytes()
Receive a raw message from the WebSocket connection.
Returns
bytes
The received message as bytes. An empty bytes object indicates that the connection has been closed.
parse_event()
Parse a raw WebSocket message into a structured realtime event.
Parameters
| Parameter | Type | Description |
|---|---|---|
| raw | str \| bytes | Raw message payload received from the server. |
Returns
RealtimeEvent
A validated RealtimeEvent instance.
receive_event()
Receive and parse the next realtime event from the server.
Returns
RealtimeEvent | None
The next RealtimeEvent, or None if the connection has closed.
Raises
SonioxRealtimeError: If the session is not connected.
receive_events()
Yield realtime events as they are received from the server.
Iteration stops automatically when the connection is closed.
Returns
AsyncIterator[RealtimeEvent]
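With the async session, sending and receiving can run concurrently: one task streams audio while the caller iterates over events. The sketch below assumes that send_bytes() is awaitable and that the session works with `async with`, as the text above suggests. `FakeAsyncSession`, `audio_source`, and `transcribe` are hypothetical, and strings stand in for RealtimeEvent objects.

```python
import asyncio


class FakeAsyncSession:
    """Hypothetical stand-in for AsyncRealtimeSTTSession."""

    async def __aenter__(self):
        # Created here so the queue belongs to the running event loop.
        self._queue = asyncio.Queue()
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def send_bytes(self, chunks, finish=True):
        async for chunk in chunks:
            await self._queue.put(f"event for {len(chunk)} bytes")
        if finish:
            await self._queue.put(None)  # marks the end of the stream

    async def receive_events(self):
        while True:
            event = await self._queue.get()
            if event is None:
                break
            yield event


async def audio_source():
    """Yield two dummy audio chunks of 4 and 2 bytes."""
    for chunk in (b"\x00" * 4, b"\x00" * 2):
        yield chunk


async def transcribe(session_factory):
    """Stream audio in a background task while collecting events."""
    async with session_factory() as session:
        sender = asyncio.create_task(session.send_bytes(audio_source(), finish=True))
        events = [event async for event in session.receive_events()]
        await sender
        return events


events = asyncio.run(transcribe(FakeAsyncSession))
print(events)
```

With the real client, `session_factory` could be something like `lambda: stt.connect(config=config)`, where `stt` is an AsyncRealtimeSTTClient.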
handle_events()
Receive realtime events and dispatch them to a handler callback.
Parameters
| Parameter | Type | Description |
|---|---|---|
| handler | Callable[[RealtimeEvent], Awaitable[None]] | Async callable awaited for each received RealtimeEvent. |
Returns
None
pause()
Pause the session, suppressing outgoing audio and starting a background keepalive task.
While paused, calls to send_byte_chunk() are silently dropped.
A background task sends a keepalive message every KEEP_ALIVE_INTERVAL_SEC seconds to prevent the server from timing out the session.
Calling pause on an already-paused session is a no-op.
Returns
None
Raises
SonioxRealtimeError: If the session is not connected.
resume()
Resume a paused session, stopping the keepalive task and allowing audio to be sent again.
Calling resume on a session that is not paused is a no-op.
Returns
None
Raises
SonioxRealtimeError: If the session is not connected.
AsyncRealtimeTTSClient
Factory for asynchronous realtime Text-to-Speech connections and streams.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
client | AsyncSonioxClient | Soniox client instance. |
Returns
None
connect()
Create a single-stream realtime Text-to-Speech connection.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | RealtimeTTSConfig | Configuration options for this operation. |
| api_key | str \| None | API key used for authentication. |
Returns
AsyncRealtimeTTSConnection
connect_multi_stream()
Create a multiplexed realtime Text-to-Speech connection.
Returns
AsyncRealtimeTTSMultiplexedConnection
AsyncRealtimeTTSConnection
Asynchronous WebSocket connection for one realtime Text-to-Speech stream.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | str | WebSocket URL for the realtime Text-to-Speech endpoint. |
| config | RealtimeTTSConfig | Configuration options for this operation. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| config | RealtimeTTSConfig | Configuration used to initialize this connection. |
| paused | bool | True if the connection is currently paused. |
| last_message | RealtimeTTSEvent \| None | Most recently received realtime event, if any. |
close()
Close the realtime Text-to-Speech connection.
Returns
None
send_text_chunk()
Send one text chunk to the realtime stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
text | str | Text chunk to generate into speech. |
text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
send_text_chunks()
Send text data to the realtime stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunks | str \| AsyncIterator[str] | Text as a single string or an async iterator of text chunks. |
| text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
finish()
Signal that no more text will be sent for this stream.
Returns
None
cancel()
Cancel the realtime Text-to-Speech stream.
Returns
None
keep_alive()
Send a keep-alive message to prevent the session from timing out.
Returns
None
pause()
Pause outgoing text and start periodic keep-alive messages.
Returns
None
resume()
Resume outgoing text and stop periodic keep-alive messages.
Returns
None
recv_bytes()
Receive one raw websocket message payload as bytes.
Returns
bytes
parse_event()
Parse a raw websocket message into a realtime event.
Parameters
| Parameter | Type | Description |
|---|---|---|
| raw | str \| bytes | Raw event payload from the realtime API. |
Returns
RealtimeTTSEvent
receive_event()
Receive and parse the next realtime event.
Returns
RealtimeTTSEvent | None
receive_events()
Yield realtime events until the stream ends or closes.
Returns
AsyncIterator[RealtimeTTSEvent]
receive_audio_chunks()
Yield decoded audio chunks from incoming realtime events.
Returns
AsyncIterator[bytes]
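Audio chunks can be written to a file-like object as they arrive rather than collected in memory first. The sketch below assumes that send_text_chunk() is awaitable on the async connection. `FakeAsyncTTSConnection` and `speak_to` are hypothetical, and the fake echoes the text as bytes in place of real audio.

```python
import asyncio
import io


class FakeAsyncTTSConnection:
    """Hypothetical stand-in for an open AsyncRealtimeTTSConnection."""

    def __init__(self):
        self.sent = []

    async def send_text_chunk(self, text, text_end=False):
        self.sent.append((text, text_end))

    async def receive_audio_chunks(self):
        for text, _ in self.sent:
            yield text.encode("utf-8")


async def speak_to(connection, text, sink):
    """Send `text` as one final chunk and write audio to `sink` as it arrives."""
    await connection.send_text_chunk(text, text_end=True)
    written = 0
    async for chunk in connection.receive_audio_chunks():
        written += sink.write(chunk)
    return written


buffer = io.BytesIO()
total = asyncio.run(speak_to(FakeAsyncTTSConnection(), "hello", buffer))
print(total, buffer.getvalue())
```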
handle_events()
Receive events and pass each one to handler.
Parameters
| Parameter | Type | Description |
|---|---|---|
| handler | Callable[[RealtimeTTSEvent], Awaitable[None]] | Async callable awaited for each received RealtimeTTSEvent. |
Returns
None
AsyncRealtimeTTSMultiplexedConnection
Asynchronous websocket connection that can host multiple TTS streams.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | str | WebSocket URL for the realtime Text-to-Speech endpoint. |
| api_key | str | API key used for authentication. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| last_message | RealtimeTTSEvent \| None | Most recently received realtime event, if any. |
| paused | bool | True if the connection is currently paused. |
close()
Close the websocket and clear the stream state.
Returns
None
keep_alive()
Send a keep-alive message to prevent the session from timing out.
Returns
None
pause()
Pause outgoing text and start periodic keep-alive messages.
Returns
None
resume()
Resume outgoing text and stop periodic keep-alive messages.
Returns
None
open_stream()
Register and start a new stream on the shared websocket.
Parameters
| Parameter | Type | Description |
|---|---|---|
config | RealtimeTTSConfig | Configuration options for this operation. |
Returns
AsyncRealtimeTTSStream
AsyncRealtimeTTSStream
Handle for one stream on a multiplexed realtime TTS connection.
Constructor
Parameters
| Parameter | Type | Description |
|---|---|---|
connection | AsyncRealtimeTTSMultiplexedConnection | Asynchronous websocket connection that can host multiple TTS streams. |
config | RealtimeTTSConfig | Configuration options for this operation. |
Returns
None
Properties
| Property | Type | Description |
|---|---|---|
| config | RealtimeTTSConfig | Stream configuration. |
| stream_id | str | Stream identifier. |
| last_message | RealtimeTTSEvent \| None | Most recently received event for this stream, if any. |
send_text_chunk()
Send one text chunk for this stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
text | str | Text chunk to generate into speech. |
text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
send_text_chunks()
Send text chunks for this stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunks | str \| AsyncIterator[str] | Text as a single string or an async iterator of text chunks. |
| text_end | bool | Whether this message marks the final text chunk for the stream. |
Returns
None
finish()
Signal that no more text will be sent for this stream.
Returns
None
cancel()
Cancel this stream.
Returns
None
keep_alive()
Send a keepalive message on the underlying shared connection.
Returns
None
pause()
Pause the underlying shared connection and start keepalive.
Returns
None
resume()
Resume the underlying shared connection and stop keepalive.
Returns
None
receive_event()
Receive the next event for this stream.
Returns
RealtimeTTSEvent | None
receive_events()
Yield events for this stream until it ends.
Returns
AsyncIterator[RealtimeTTSEvent]
receive_audio_chunks()
Yield decoded audio chunks for this stream.
Returns
AsyncIterator[bytes]
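Several async streams on one multiplexed connection can be driven concurrently with `asyncio.gather`. The sketch below works on streams that are already open, because this reference does not say whether open_stream() is awaitable. `FakeAsyncStream`, `speak`, and `speak_all` are hypothetical, and the fake echoes text as bytes in place of real audio.

```python
import asyncio


class FakeAsyncStream:
    """Hypothetical stand-in for AsyncRealtimeTTSStream."""

    def __init__(self, stream_id):
        self.stream_id = stream_id
        self._text = []

    async def send_text_chunk(self, text, text_end=False):
        self._text.append(text)

    async def receive_audio_chunks(self):
        for text in self._text:
            await asyncio.sleep(0)  # yield control, as a network read would
            yield text.encode("utf-8")


async def speak(stream, text):
    """Send one final text chunk and return (stream_id, joined audio)."""
    await stream.send_text_chunk(text, text_end=True)
    audio = b"".join([chunk async for chunk in stream.receive_audio_chunks()])
    return stream.stream_id, audio


async def speak_all(streams, texts):
    """Run one speak() coroutine per stream concurrently."""
    results = await asyncio.gather(*(speak(s, t) for s, t in zip(streams, texts)))
    return dict(results)


streams = [FakeAsyncStream("a"), FakeAsyncStream("b")]
audio_by_stream = asyncio.run(speak_all(streams, ["one", "two"]))
print(audio_by_stream)
```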