Classes
Soniox Client SDK — Class Reference
SonioxClient
Main entry point for the Soniox client SDK.
permissions
Permission resolver, if configured.
Returns undefined if no resolver was provided (SSR-safe).
Returns
PermissionResolver | undefined
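Because permissions can be undefined, calling code usually needs a guard. The following is a minimal sketch, not SDK code: MicPermissionResolver and checkMicrophone are hypothetical names, and the result type is left generic because the fields of PermissionResult are not listed in this reference.

```typescript
// Structural stand-in for the resolver documented in this reference.
// R stands for the SDK's PermissionResult, whose fields are not listed here.
interface MicPermissionResolver<R> {
  check(permission: "microphone"): Promise<R>;
  request(permission: "microphone"): Promise<R>;
}

// Hypothetical helper: resolves to undefined when no resolver is available
// (for example during server-side rendering) instead of throwing.
async function checkMicrophone<R>(
  resolver: MicPermissionResolver<R> | undefined,
): Promise<R | undefined> {
  if (resolver === undefined) return undefined;
  return resolver.check("microphone");
}
```

Under these assumptions, a call such as checkMicrophone(client.permissions) would be safe in both browser and server-rendered code paths.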
Constructor
Parameters
| Parameter | Type |
|---|---|
options | SonioxClientOptions |
Returns
SonioxClient
Properties
| Property | Type | Description |
|---|---|---|
realtime | { record: (options) => Recording; stt: (config, options) => RealtimeSttSession; tts: ClientTtsFactory; } | Real-time API namespace |
realtime.record | (options) => Recording | Start a high-level recording session. Returns synchronously so callers can attach event listeners before any async work (key fetch, mic access, connection) begins. |
realtime.stt | (config, options) => RealtimeSttSession | Create a low-level STT session. The WebSocket URL is derived from the client's config (respecting region / base_domain / stt_ws_url) when config is a plain object, or from ws_base_url on the legacy path. If config was passed as an async function, call client.realtime.record() instead, or pass ws_base_url explicitly to SonioxClient. Throws SonioxError if the WebSocket URL cannot be resolved synchronously (async-config client without ws_base_url). |
realtime.tts | ClientTtsFactory | TTS factory: call it directly for a single stream, or call .multiStream() for a multi-stream connection. Uses the client's config resolver to obtain credentials and the TTS WebSocket URL. Examples: `const stream = await client.realtime.tts({ model: 'tts-rt-v1-preview', voice: 'Adrian', language: 'en', audio_format: 'wav' }); stream.sendText("Hello"); stream.finish(); for await (const chunk of stream) { process(chunk); }` and `const conn = await client.realtime.tts.multiStream(); const s1 = await conn.stream({ model: 'tts-rt-v1-preview', voice: 'Adrian', language: 'en', audio_format: 'wav' });` |
tts | { generate: Promise<Uint8Array<ArrayBufferLike>>; generateStream: AsyncIterable<Uint8Array<ArrayBufferLike>>; } | REST TTS API namespace. Example: `const audio = await client.tts.generate({ text: 'Hello', voice: 'Adrian', language: 'en' });` |
tts.generate | Promise<Uint8Array<ArrayBufferLike>> | - |
tts.generateStream | AsyncIterable<Uint8Array<ArrayBufferLike>> | - |
Recording
state
Current recording state
Returns
cancel()
Immediately cancel recording without waiting for final results
Returns
void
finalize()
Request the server to finalize current non-final tokens.
Parameters
| Parameter | Type |
|---|---|
options? | { trailing_silence_ms?: number; } |
options.trailing_silence_ms? | number |
Returns
void
off()
Remove an event handler
Type Parameters
| Type Parameter |
|---|
E extends keyof RecordingEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | RecordingEvents[E] |
Returns
this
on()
Register an event handler
Type Parameters
| Type Parameter |
|---|
E extends keyof RecordingEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | RecordingEvents[E] |
Returns
this
once()
Register a one-time event handler
Type Parameters
| Type Parameter |
|---|
E extends keyof RecordingEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | RecordingEvents[E] |
Returns
this
pause()
Pause recording.
Pauses the audio source (stops microphone capture) and pauses the session (activates automatic keepalive to prevent server disconnect).
Returns
void
reconnect()
Force a reconnection. Tears down the current session and audio encoder, then establishes a new session via the standard reconnect flow (backoff, config re-resolution, buffer drain).
Use this to recover from stale connections after platform lifecycle
events such as laptop sleep/wake (web visibilitychange) or app
backgrounding (React Native AppState).
Requires auto_reconnect to be enabled. No-op when the recording
is not in recording or paused state.
Returns
void
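The sleep/wake recovery described for reconnect() could be wired up roughly as follows. This is a sketch under assumptions: reconnectOnVisible, Reconnectable, and VisibilityTarget are hypothetical names, and the structural types stand in for a Recording and for the browser's document so the snippet does not depend on DOM typings.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface Reconnectable {
  reconnect(): void;
}

interface VisibilityTarget {
  visibilityState: string;
  addEventListener(type: "visibilitychange", listener: () => void): void;
  removeEventListener(type: "visibilitychange", listener: () => void): void;
}

// Hypothetical helper: calls recording.reconnect() whenever the page becomes
// visible again. Returns a function that removes the listener.
function reconnectOnVisible(
  target: VisibilityTarget,
  recording: Reconnectable,
): () => void {
  const listener = () => {
    if (target.visibilityState === "visible") recording.reconnect();
  };
  target.addEventListener("visibilitychange", listener);
  return () => target.removeEventListener("visibilitychange", listener);
}
```

In a browser, target would typically be document. Since reconnect() is documented as a no-op outside the recording and paused states, the listener does not need its own state check.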
resume()
Resume recording after pause.
Resumes the audio source and session. Audio capture and transmission continue from where they left off. If audio was buffered during a reconnect while paused, the buffer is drained now.
Returns
void
stop()
Gracefully stop recording
Stops the audio source and waits for the server to process all buffered audio and return final results.
Returns
Promise<void>
Promise that resolves when the server acknowledges completion
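Since stop() waits for the server while cancel() returns immediately, an application may want to bound the wait. The sketch below shows one way to do that; stopWithTimeout and Stoppable are hypothetical names, and the timeout policy is an illustration rather than SDK behavior.

```typescript
// Structural subset of Recording used by this sketch.
interface Stoppable {
  stop(): Promise<void>;
  cancel(): void;
}

// Hypothetical helper: resolves to "stopped" if stop() settles in time,
// otherwise calls cancel() and resolves to "cancelled".
async function stopWithTimeout(
  recording: Stoppable,
  timeoutMs: number,
): Promise<"stopped" | "cancelled"> {
  const stopped = recording.stop().then(() => "stopped" as const);
  const timeout = new Promise<"cancelled">((resolve) => {
    const timer = setTimeout(() => resolve("cancelled"), timeoutMs);
    const clear = () => clearTimeout(timer);
    stopped.then(clear, clear); // drop the timer once stop() settles
  });
  const outcome = await Promise.race([stopped, timeout]);
  if (outcome === "cancelled") recording.cancel();
  return outcome;
}
```

If stop() rejects before the timeout, the rejection propagates to the caller unchanged.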
MicrophoneSource
Browser microphone audio source
Uses navigator.mediaDevices.getUserMedia to capture audio from the microphone
and MediaRecorder to encode it into chunks.
Constructor
Parameters
| Parameter | Type |
|---|---|
options | MicrophoneSourceOptions |
Returns
MicrophoneSource
pause()
Pause audio capture
Returns
void
restart()
Reinitialize the MediaRecorder on the existing stream so the next chunks contain a fresh container header (required after reconnecting to a new server session).
Returns
void
resume()
Resume audio capture
Returns
void
start()
Request microphone access and start recording
Parameters
| Parameter | Type |
|---|---|
handlers | AudioSourceHandlers |
Returns
Promise<void>
Throws
AudioUnavailableError if getUserMedia or MediaRecorder is not supported
Throws
AudioPermissionError if microphone access is denied
Throws
AudioDeviceError if no microphone is found
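The AudioPermissionError and AudioDeviceError entries in this reference each name the getUserMedia DOMException they map to. A small sketch of that mapping, with classifyGetUserMediaError and AudioErrorKind as hypothetical names (the SDK's internal logic is not shown in this reference):

```typescript
type AudioErrorKind = "AudioPermissionError" | "AudioDeviceError" | "other";

// Maps the name of a DOMException thrown by getUserMedia to the error class
// this reference associates with it. Any other name is reported as "other".
function classifyGetUserMediaError(name: string): AudioErrorKind {
  switch (name) {
    case "NotAllowedError":
      return "AudioPermissionError";
    case "NotFoundError":
      return "AudioDeviceError";
    default:
      return "other";
  }
}
```

AudioUnavailableError is not part of this mapping because it is documented as a capability problem (getUserMedia or MediaRecorder missing) rather than a DOMException.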
stop()
Stop recording and release all resources
Returns
void
BrowserPermissionResolver
Browser permission resolver for checking and requesting microphone access.
Constructor
Returns
BrowserPermissionResolver
check()
Check current microphone permission status without prompting the user.
Parameters
| Parameter | Type |
|---|---|
permission | "microphone" |
Returns
Promise<PermissionResult>
request()
Request microphone permission from the user. This may show a browser permission prompt.
Parameters
| Parameter | Type |
|---|---|
permission | "microphone" |
Returns
Promise<PermissionResult>
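Because check() never prompts and request() may, a common pattern is to check first and request only when needed. The sketch below assumes nothing about the fields of PermissionResult: the caller supplies an isGranted predicate, and ensureMicrophone and MicPermissionResolver are hypothetical names.

```typescript
// Structural stand-in for BrowserPermissionResolver; R stands for PermissionResult.
interface MicPermissionResolver<R> {
  check(permission: "microphone"): Promise<R>;
  request(permission: "microphone"): Promise<R>;
}

// Hypothetical helper. isGranted is supplied by the caller because the fields
// of PermissionResult are not listed in this reference.
async function ensureMicrophone<R>(
  resolver: MicPermissionResolver<R>,
  isGranted: (result: R) => boolean,
): Promise<R> {
  const current = await resolver.check("microphone"); // does not prompt
  if (isGranted(current)) return current;
  return resolver.request("microphone"); // may show a browser prompt
}
```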
AudioPermissionError
Thrown when microphone access is denied by the user or blocked by the browser.
Maps to getUserMedia NotAllowedError DOMException.
Extends
toJSON()
Converts to a plain object for logging/serialization
Returns
Record<string, unknown>
Inherited from
toString()
Creates a human-readable string representation
Returns
string
Inherited from
Properties
| Property | Type | Description |
|---|---|---|
cause | unknown | The underlying error that caused this error, if any. |
code | SonioxErrorCode \| string & { } | Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions. |
statusCode | number \| undefined | HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors). |
AudioDeviceError
Thrown when no audio input device is found
Maps to getUserMedia NotFoundError DOMException.
Extends
toJSON()
Converts to a plain object for logging/serialization
Returns
Record<string, unknown>
Inherited from
toString()
Creates a human-readable string representation
Returns
string
Inherited from
Properties
| Property | Type | Description |
|---|---|---|
cause | unknown | The underlying error that caused this error, if any. |
code | SonioxErrorCode \| string & { } | Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions. |
statusCode | number \| undefined | HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors). |
AudioUnavailableError
Thrown when audio capture is not supported in the current environment, for example when getUserMedia or MediaRecorder is not available.
Extends
toJSON()
Converts to a plain object for logging/serialization
Returns
Record<string, unknown>
Inherited from
toString()
Creates a human-readable string representation
Returns
string
Inherited from
Properties
| Property | Type | Description |
|---|---|---|
cause | unknown | The underlying error that caused this error, if any. |
code | SonioxErrorCode \| string & { } | Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions. |
statusCode | number \| undefined | HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors). |
RealtimeTtsConnection
WebSocket connection for real-time Text-to-Speech.
Supports up to 5 concurrent streams multiplexed by stream_id.
The connection automatically sends keepalive messages while open.
Extends
TypedEmitter<TtsConnectionEvents>
isConnected
Whether the WebSocket is connected.
Returns
boolean
Constructor
Parameters
| Parameter | Type |
|---|---|
apiKey | string |
wsUrl | string |
ttsDefaults? | Partial<TtsStreamConfig> |
options? | TtsConnectionOptions |
Returns
RealtimeTtsConnection
Overrides
close()
Close the WebSocket connection and terminate all active streams.
Returns
void
connect()
Open the WebSocket connection and start keepalive. Called automatically by stream if not yet connected.
Returns
Promise<void>
emit()
Emit an event to all registered handlers.
Handler errors do not prevent other handlers from running.
Errors thrown by handlers are reported to the error event if a handler for it is registered; otherwise they are rethrown asynchronously.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsConnectionEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
...args | Parameters<TtsConnectionEvents[E]> |
Returns
void
Inherited from
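The emit() semantics above (one failing handler does not block the others, and failures go to an error event or are rethrown asynchronously) can be illustrated with a small untyped emitter. This is a sketch of the described behavior, not the SDK's TypedEmitter; TinyEmitter and its internals are hypothetical.

```typescript
type Handler = (...args: unknown[]) => void;

class TinyEmitter {
  private handlers = new Map<string, Set<Handler>>();

  on(event: string, handler: Handler): this {
    let set = this.handlers.get(event);
    if (!set) {
      set = new Set();
      this.handlers.set(event, set);
    }
    set.add(handler);
    return this;
  }

  emit(event: string, ...args: unknown[]): void {
    const set = this.handlers.get(event);
    if (!set) return;
    // Iterate over a copy so handlers may register or remove handlers safely.
    for (const handler of Array.from(set)) {
      try {
        handler(...args);
      } catch (err) {
        this.report(event, err); // keep going with the remaining handlers
      }
    }
  }

  private report(event: string, err: unknown): void {
    const errorHandlers = this.handlers.get("error");
    if (event !== "error" && errorHandlers && errorHandlers.size > 0) {
      this.emit("error", err);
    } else {
      // No error handler, or an error handler itself threw: rethrow later
      // so the current emit() call is not interrupted.
      setTimeout(() => {
        throw err;
      }, 0);
    }
  }
}
```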
off()
Remove an event handler.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsConnectionEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | TtsConnectionEvents[E] |
Returns
this
Inherited from
on()
Register an event handler.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsConnectionEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | TtsConnectionEvents[E] |
Returns
this
Inherited from
once()
Register a one-time event handler.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsConnectionEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | TtsConnectionEvents[E] |
Returns
this
Inherited from
removeAllListeners()
Remove all event handlers.
Parameters
| Parameter | Type |
|---|---|
event? | keyof TtsConnectionEvents |
Returns
void
Inherited from
stream()
Open a new TTS stream on this connection. Auto-connects if the WebSocket is not yet open.
Parameters
| Parameter | Type | Description |
|---|---|---|
input? | TtsStreamInput | Stream configuration (merged with tts_defaults) |
Returns
Promise<RealtimeTtsStream>
A ready-to-use stream handle
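Given the limit of 5 concurrent streams per connection, a caller with many texts may want to process them in batches. The sketch below uses structural stand-ins (TtsConnectionLike, TtsStreamLike) and hypothetical helpers (synthesizeOne, synthesizeAll). It assumes that a stream's slot is released once its async iterator ends, that stream() can rely on connection defaults when called without input, and that sendText with end set to true is enough to complete a single-chunk stream.

```typescript
interface TtsStreamLike extends AsyncIterable<Uint8Array> {
  sendText(text: string, options?: { end?: boolean }): void;
}

interface TtsConnectionLike {
  stream(): Promise<TtsStreamLike>;
}

const MAX_STREAMS = 5; // concurrent stream limit stated in this reference

// Synthesizes one text on its own stream and collects the audio chunks.
async function synthesizeOne(
  conn: TtsConnectionLike,
  text: string,
): Promise<Uint8Array[]> {
  const stream = await conn.stream();
  stream.sendText(text, { end: true }); // single chunk, marked as final
  const chunks: Uint8Array[] = [];
  for await (const chunk of stream) chunks.push(chunk);
  return chunks;
}

// Processes texts in batches so that at most MAX_STREAMS streams are open.
async function synthesizeAll(
  conn: TtsConnectionLike,
  texts: string[],
): Promise<Uint8Array[][]> {
  const results: Uint8Array[][] = [];
  for (let i = 0; i < texts.length; i += MAX_STREAMS) {
    const batch = texts.slice(i, i + MAX_STREAMS);
    const audio = await Promise.all(batch.map((t) => synthesizeOne(conn, t)));
    results.push(...audio);
  }
  return results;
}
```

Batching is the simplest policy; a sliding window would keep all five slots busy but needs more bookkeeping.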
RealtimeTtsStream
Handle for one TTS stream on a WebSocket connection.
Emits typed events and supports async iteration over decoded audio chunks.
Extends
TypedEmitter<TtsStreamEvents>
state
Current stream lifecycle state.
Returns
[asyncIterator]()
Async iterator that yields decoded audio chunks.
Returns
AsyncIterator<Uint8Array<ArrayBufferLike>>
cancel()
Cancel this stream. The server will stop generating and send terminated.
Returns
void
close()
Close this stream. For single-stream usage (created via tts(input)),
also closes the underlying WebSocket connection.
Returns
void
emit()
Emit an event to all registered handlers.
Handler errors do not prevent other handlers from running.
Errors thrown by handlers are reported to the error event if a handler for it is registered; otherwise they are rethrown asynchronously.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsStreamEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
...args | Parameters<TtsStreamEvents[E]> |
Returns
void
Inherited from
finish()
Signal that no more text will be sent for this stream.
The server will finish generating audio and send terminated.
Returns
void
off()
Remove an event handler.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsStreamEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | TtsStreamEvents[E] |
Returns
this
Inherited from
on()
Register an event handler.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsStreamEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | TtsStreamEvents[E] |
Returns
this
Inherited from
once()
Register a one-time event handler.
Type Parameters
| Type Parameter |
|---|
E extends keyof TtsStreamEvents |
Parameters
| Parameter | Type |
|---|---|
event | E |
handler | TtsStreamEvents[E] |
Returns
this
Inherited from
removeAllListeners()
Remove all event handlers.
Parameters
| Parameter | Type |
|---|---|
event? | keyof TtsStreamEvents |
Returns
void
Inherited from
sendStream()
Pipe an async iterable of text chunks into the stream. Automatically calls finish when the iterable completes.
Designed for concurrent use: call sendStream() and consume audio
via for await or events simultaneously.
Parameters
| Parameter | Type |
|---|---|
source | AsyncIterable<string> |
Returns
Promise<void>
Example
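A minimal sketch of the concurrent pattern described above, assuming only the documented sendStream() signature and async iteration. TtsStreamLike and pipeAndCollect are hypothetical names, and the stream is typed structurally rather than imported from the SDK.

```typescript
interface TtsStreamLike extends AsyncIterable<Uint8Array> {
  sendStream(source: AsyncIterable<string>): Promise<void>;
}

// Sends text and consumes audio at the same time, then returns all chunks.
async function pipeAndCollect(
  stream: TtsStreamLike,
  source: AsyncIterable<string>,
): Promise<Uint8Array[]> {
  const chunks: Uint8Array[] = [];
  const consume = (async () => {
    for await (const chunk of stream) chunks.push(chunk);
  })();
  // Run both sides together; sendStream() is documented to call finish()
  // when the source completes, which lets the audio iterator end.
  await Promise.all([stream.sendStream(source), consume]);
  return chunks;
}
```

Using Promise.all means a failure on either side rejects the returned promise instead of being left unhandled.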
sendText()
Send one text chunk to the TTS stream.
Parameters
| Parameter | Type | Description |
|---|---|---|
text | string | Text to synthesize |
options? | { end?: boolean; } | - |
options.end? | boolean | If true, signals this is the final text chunk |
Returns
void
Properties
| Property | Type |
|---|---|
streamId | string |
SonioxError
Extends
Error
Extended by
Constructor
Parameters
| Parameter | Type |
|---|---|
message | string |
code? | SonioxErrorCode \| string & { } |
statusCode? | number |
cause? | unknown |
Returns
SonioxError
Overrides
toJSON()
Converts to a plain object for logging/serialization
Returns
Record<string, unknown>
toString()
Creates a human-readable string representation
Returns
string
Properties
| Property | Type | Description |
|---|---|---|
cause | unknown | The underlying error that caused this error, if any. |
code | SonioxErrorCode \| string & { } | Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions. |
statusCode | number \| undefined | HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors). |
SonioxHttpError
HTTP error class for all HTTP-related failures (REST API).
Thrown when HTTP requests fail due to network issues, timeouts, server errors, or response parsing failures.
Extends
Constructor
Parameters
| Parameter | Type |
|---|---|
details | HttpErrorDetails |
Returns
SonioxHttpError
Overrides
toJSON()
Converts to a plain object for logging/serialization
Returns
Record<string, unknown>
Overrides
toString()
Creates a human-readable string representation
Returns
string
Overrides
Properties
| Property | Type | Description |
|---|---|---|
bodyText | string \| undefined | Response body text, capped at 4KB (only for http_error/parse_error) |
cause | unknown | The underlying error that caused this error, if any. |
code | HttpErrorCode | Categorized HTTP error code |
headers | Record<string, string> \| undefined | Response headers (only for http_error) |
method | HttpMethod | HTTP method |
statusCode | number \| undefined | HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors). |
url | string | Request URL |
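The properties listed above are enough to build a compact log line. The following sketch uses a structural subset of SonioxHttpError; HttpErrorLike and describeHttpError are hypothetical names, and the 200-character truncation is an arbitrary choice for illustration.

```typescript
// Structural subset of the SonioxHttpError properties listed above.
interface HttpErrorLike {
  code: string;
  method: string;
  url: string;
  statusCode?: number;
  bodyText?: string;
}

// Formats a one-line description suitable for logging.
function describeHttpError(err: HttpErrorLike): string {
  const status = err.statusCode !== undefined ? ` ${err.statusCode}` : "";
  const body = err.bodyText ? `: ${err.bodyText.slice(0, 200)}` : "";
  return `${err.method} ${err.url} failed (${err.code}${status})${body}`;
}
```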
TtsRestClient
Browser-safe REST client for TTS generation.
Provides generate() (buffered) and generateStream() (streaming)
using only globalThis.fetch. HTTP failures are surfaced as
SonioxHttpError, matching the rest of the Soniox SDK.
Authentication uses the Authorization: Bearer <api_key> header.
Constructor
Parameters
| Parameter | Type |
|---|---|
apiKey | string |
ttsApiUrl | string |
Returns
TtsRestClient
generate()
Generate speech audio from text. Returns the full audio as a Uint8Array.
Parameters
| Parameter | Type |
|---|---|
options | GenerateSpeechOptions |
Returns
Promise<Uint8Array<ArrayBufferLike>>
Throws
SonioxHttpError on non-2xx responses, network failures, or aborted requests.
generateStream()
Generate speech audio from text as a streaming async iterable.
Yields Uint8Array chunks as they arrive from the server response body.
Lower time-to-first-audio than generate.
Known limitation: Mid-stream server errors (reported via HTTP trailers)
cannot be detected through the fetch API. The iterator may end early
without an explicit error. Use WebSocket TTS for reliable error detection.
Parameters
| Parameter | Type |
|---|---|
options | GenerateSpeechOptions |
Returns
AsyncIterable<Uint8Array<ArrayBufferLike>>
Throws
SonioxHttpError on non-2xx responses, network failures, or aborted requests (before the stream starts).
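When the streamed chunks are needed as one buffer after playback or progress reporting, they can be concatenated by hand. A minimal sketch, with collectAudio as a hypothetical helper that accepts any async iterable of Uint8Array, such as the one generateStream() is documented to return:

```typescript
// Drains an async iterable of audio chunks into a single Uint8Array.
async function collectAudio(
  source: AsyncIterable<Uint8Array>,
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of source) {
    chunks.push(chunk);
    total += chunk.length;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset); // copy each chunk at its running offset
    offset += chunk.length;
  }
  return out;
}
```

Because of the known limitation above, a buffer collected this way may be shorter than expected without any error having been thrown.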