Classes

SonioxClient

Main entry point for the Soniox client SDK.

Example

// Recommended: async config with region
const client = new SonioxClient({
  config: async () => {
    const res = await fetch('/api/soniox-config', { method: 'POST' });
    return await res.json(); // { api_key, region }
  },
});

// High-level: record from microphone
const recording = client.realtime.record({ model: 'stt-rt-v4' });
recording.on('result', (r) => console.log(r.tokens));
await recording.stop();

// Low-level: direct session access. With an async `config` (as above), the client also needs `ws_base_url`; see `realtime.stt` below.
const session = client.realtime.stt({ model: 'stt-rt-v4' }, { api_key: key });
await session.connect();

permissions

get permissions(): PermissionResolver | undefined;

Permission resolver, if configured. Returns undefined if no resolver was provided (SSR-safe).

Example

const mic = await client.permissions?.check('microphone');
if (mic?.status === 'denied') {
  showSettingsMessage();
}

Returns

PermissionResolver | undefined

Parameter	Type
`options`	`SonioxClientOptions`

Properties

Property	Type	Description
`realtime`	{ `record`: (`options`) => `Recording`; `stt`: (`config`, `options`) => `RealtimeSttSession`; `tts`: `ClientTtsFactory`; }	Real-time API namespace
`realtime.record`	(`options`) => `Recording`	Start a high-level recording session. Returns synchronously so callers can attach event listeners before any async work (key fetch, mic access, connection) begins.
`realtime.stt`	(`config`, `options`) => `RealtimeSttSession`	Create a low-level STT session. The WebSocket URL is derived from the client's `config` (respecting `region` / `base_domain` / `stt_ws_url`) when `config` is a plain object, or from `ws_base_url` on the legacy path. If `config` was passed as an async function, call `client.realtime.record()` instead, or pass `ws_base_url` explicitly to `SonioxClient`. Throws SonioxError if the WebSocket URL cannot be resolved synchronously (async-config client without `ws_base_url`).
`realtime.tts`	`ClientTtsFactory`	TTS factory — callable for single-stream, `.multiStream()` for multi-stream. Uses the client's config resolver to obtain credentials and TTS WebSocket URL. Examples `const stream = await client.realtime.tts({ model: 'tts-rt-v1', voice: 'Adrian', language: 'en', audio_format: 'wav', }); stream.sendText("Hello"); stream.finish(); for await (const chunk of stream) { process(chunk); }` `const conn = await client.realtime.tts.multiStream(); const s1 = await conn.stream({ model: 'tts-rt-v1', voice: 'Adrian', language: 'en', audio_format: 'wav', });`
`tts`	{ `generate`: `Promise`<`Uint8Array`<`ArrayBufferLike`>>; `generateStream`: `AsyncIterable`<`Uint8Array`<`ArrayBufferLike`>>; }	REST TTS API namespace. Example `const audio = await client.tts.generate({ text: 'Hello', voice: 'Adrian', language: 'en', });`
`tts.generate`	`Promise`<`Uint8Array`<`ArrayBufferLike`>>	-
`tts.generateStream`	`AsyncIterable`<`Uint8Array`<`ArrayBufferLike`>>	-

Recording

state

get state(): RecordingState;

Current recording state

Returns

RecordingState

cancel()

cancel(): void;

Immediately cancel recording without waiting for final results

Returns

void

Type Parameter
`E` extends keyof `RecordingEvents`

pause()

pause(): void;

Pause recording.

Pauses the audio source (stops microphone capture) and pauses the session (activates automatic keepalive to prevent server disconnect).

Returns

void

reconnect()

reconnect(): void;

Force a reconnection — tears down the current session and audio encoder, then establishes a new session via the standard reconnect flow (backoff, config re-resolution, buffer drain).

Use this to recover from stale connections after platform lifecycle events such as laptop sleep/wake (web visibilitychange) or app backgrounding (React Native AppState).

Requires auto_reconnect to be enabled. No-op when the recording is not in recording or paused state.

Returns

void
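
The sleep/wake recovery described above can be wired to the Page Visibility API. The following is a minimal sketch, not SDK code: `ReconnectableRecording`, `VisibilityTarget`, and `reconnectOnWake` are hypothetical names introduced here, and the recording is assumed to have been created with auto_reconnect enabled.

```typescript
// Stand-ins for the pieces of Recording and `document` used below.
// Both interfaces are assumptions made for this sketch, not SDK types.
interface ReconnectableRecording {
  reconnect(): void;
}

interface VisibilityTarget {
  readonly visibilityState: string;
  addEventListener(type: "visibilitychange", listener: () => void): void;
  removeEventListener(type: "visibilitychange", listener: () => void): void;
}

// Force a reconnect each time the page becomes visible again.
// Returns a function that removes the listener.
function reconnectOnWake(
  recording: ReconnectableRecording,
  target: VisibilityTarget,
): () => void {
  const onChange = (): void => {
    if (target.visibilityState === "visible") {
      // reconnect() is documented as a no-op outside the recording and
      // paused states, so no extra state check is made here.
      recording.reconnect();
    }
  };
  target.addEventListener("visibilitychange", onChange);
  return () => target.removeEventListener("visibilitychange", onChange);
}
```

In a browser, `document` would normally be passed as the target; call the returned function when the recording ends so the listener does not outlive it.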

resume()

resume(): void;

Resume recording after pause.

Resumes the audio source and session. Audio capture and transmission continue from where they left off. If audio was buffered during a reconnect while paused, the buffer is drained now.

Returns

void

stop()

stop(): Promise<void>;

Gracefully stop recording

Stops the audio source and waits for the server to process all buffered audio and return final results.

Returns

Promise<void>

Promise that resolves when the server acknowledges completion
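
Because stop() waits for the server while cancel() does not, a caller may want to bound the wait. The sketch below races stop() against a timer and falls back to cancel(). `StoppableRecording` and `stopOrCancel` are hypothetical names, and whether cancel() may be called while stop() is still pending is an assumption of this sketch rather than something this page states.

```typescript
// Stand-in for the two Recording methods used below (an assumption for
// this sketch; the real class has more members).
interface StoppableRecording {
  stop(): Promise<void>;
  cancel(): void;
}

// Try a graceful stop; if it has not completed within `timeoutMs`,
// cancel instead. Resolves to the path that was taken.
async function stopOrCancel(
  recording: StoppableRecording,
  timeoutMs: number,
): Promise<"stopped" | "cancelled"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"cancelled">((resolve) => {
    timer = setTimeout(() => resolve("cancelled"), timeoutMs);
  });
  const stopped = recording.stop().then((): "stopped" => "stopped");
  try {
    const outcome = await Promise.race([stopped, timeout]);
    if (outcome === "cancelled") {
      recording.cancel();
    }
    return outcome;
  } finally {
    // Always clear the timer so it cannot fire after we return.
    clearTimeout(timer);
  }
}
```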

MicrophoneSource

Browser microphone audio source

Uses navigator.mediaDevices.getUserMedia to capture audio from the microphone and MediaRecorder to encode it into chunks.

Example

const source = new MicrophoneSource();
await source.start({
  onData: (chunk) => session.sendAudio(chunk),
  onError: (err) => console.error(err),
});
// Later:
source.stop();

Parameter	Type
`options`	`MicrophoneSourceOptions`

restart()

restart(): void;

Reinitialize the MediaRecorder on the existing stream so the next chunks contain a fresh container header (required after reconnecting to a new server session).

Returns

void

start()

start(handlers): Promise<void>;

Request microphone access and start recording

Parameters

Parameter	Type
`handlers`	`AudioSourceHandlers`

Returns

Promise<void>

Throws

AudioUnavailableError if getUserMedia or MediaRecorder is not supported

Throws

AudioPermissionError if microphone access is denied

Throws

AudioDeviceError if no microphone is found
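
The error classes documented on this page state that AudioPermissionError maps to the getUserMedia NotAllowedError DOMException and AudioDeviceError to NotFoundError. The sketch below shows one way such a mapping could be expressed. It is an illustration only, not the SDK's source; `MicFailure` and `classifyGetUserMediaError` are hypothetical names.

```typescript
// Coarse categories for a rejected getUserMedia() call.
type MicFailure = "permission_denied" | "no_device" | "other";

// Classify a rejection by its DOMException name. Only the two names that
// this page ties to SDK error classes are recognised; anything else is
// reported as "other".
function classifyGetUserMediaError(error: unknown): MicFailure {
  const name =
    typeof error === "object" && error !== null && "name" in error
      ? String((error as { name: unknown }).name)
      : "";
  switch (name) {
    case "NotAllowedError":
      return "permission_denied"; // corresponds to AudioPermissionError
    case "NotFoundError":
      return "no_device"; // corresponds to AudioDeviceError
    default:
      return "other";
  }
}
```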

stop()

stop(): void;

Stop recording and release all resources

Returns

void

Parameter	Type
`permission`	`"microphone"`

AudioPermissionError

Thrown when microphone access is denied by the user or blocked by the browser.

Maps to getUserMedia NotAllowedError DOMException.

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Inherited from

SonioxError.toJSON

toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Inherited from

SonioxError.toString

Properties

Property	Type	Description
`cause`	`unknown`	The underlying error that caused this error, if any.
`code`	`SonioxErrorCode | (string & {})`	Error code describing the type of error. Typed as `string` at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
`statusCode`	`number | undefined`	HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

AudioDeviceError

Thrown when no audio input device is found

Maps to getUserMedia NotFoundError DOMException.

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Inherited from

SonioxError.toJSON

toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Inherited from

SonioxError.toString

Properties

Property	Type	Description
`cause`	`unknown`	The underlying error that caused this error, if any.
`code`	`SonioxErrorCode | (string & {})`	Error code describing the type of error. Typed as `string` at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
`statusCode`	`number | undefined`	HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

AudioUnavailableError

Thrown when audio capture is not supported in the current environment

For example, when getUserMedia or MediaRecorder is not available.

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Inherited from

SonioxError.toJSON

toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Inherited from

SonioxError.toString

Properties

Property	Type	Description
`cause`	`unknown`	The underlying error that caused this error, if any.
`code`	`SonioxErrorCode | (string & {})`	Error code describing the type of error. Typed as `string` at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
`statusCode`	`number | undefined`	HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

The returned iterator's return() resets the internal iterator-attach flag and drops any buffered events, so consumers that exit for await early (via break etc.) stop accruing memory while the session keeps running.

Returns

AsyncIterator<RealtimeEvent>
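
To make the early-exit behaviour concrete, here is a small push-to-pull queue whose iterator return() clears the attach flag and the buffer, in the way the paragraph above describes. It is a sketch under stated assumptions, not the session's implementation: `EventQueue` and its members are hypothetical, and ignoring events while no iterator is attached is a choice made for this example.

```typescript
// A tiny queue: a producer calls push(), a consumer reads with `for await`.
class EventQueue<T> implements AsyncIterable<T> {
  private buffer: T[] = [];
  private waiter: ((result: IteratorResult<T>) => void) | null = null;
  private attached = false;

  // Number of events currently held in memory.
  get buffered(): number {
    return this.buffer.length;
  }

  push(event: T): void {
    if (!this.attached) return; // nobody is iterating: do not accumulate
    if (this.waiter) {
      const resolve = this.waiter;
      this.waiter = null;
      resolve({ value: event, done: false });
    } else {
      this.buffer.push(event);
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    this.attached = true;
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.buffer.length > 0) {
          return Promise.resolve({ value: this.buffer.shift() as T, done: false });
        }
        return new Promise<IteratorResult<T>>((resolve) => {
          this.waiter = resolve;
        });
      },
      // Invoked automatically when a `for await` loop exits early.
      return: (): Promise<IteratorResult<T>> => {
        this.attached = false;
        this.buffer = [];
        this.waiter = null;
        return Promise.resolve({ value: undefined, done: true });
      },
    };
  }
}
```

A `for await` loop that exits through `break` calls return() on the iterator, so the queue stops holding events even though the producer keeps pushing.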

close()

close(): void;

Close (cancel) the session immediately without waiting

Returns

void

connect()

connect(): Promise<void>;

Connect to the Soniox WebSocket API.

Returns

Promise<void>

Throws

AbortError If aborted

Throws

ConnectionError If connection fails

Throws

StateError If already connected

finalize()

finalize(options?): void;

Request that the server finalize the current transcription

Parameters

Parameter	Type
`options?`	{ `trailing_silence_ms?`: `number`; }
`options.trailing_silence_ms?`	`number`

Returns

void

finish()

finish(): Promise<void>;

Gracefully finish the session

Returns

Promise<void>

keepAlive()

keepAlive(): void;

Send a keepalive message

Returns

void

off()

off<E>(event, handler): this;

Remove an event handler

Type Parameters

Type Parameter
`E` extends keyof `SttSessionEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`SttSessionEvents`[`E`]

Returns

this

on()

on<E>(event, handler): this;

Type Parameters

Type Parameter
`E` extends keyof `SttSessionEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`SttSessionEvents`[`E`]

Returns

this

once()

once<E>(event, handler): this;

Type Parameters

Type Parameter
`E` extends keyof `SttSessionEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`SttSessionEvents`[`E`]

Returns

this

pause()

pause(): void;

Pause audio transmission and start automatic keepalive messages

Returns

void

resume()

resume(): void;

Resume audio transmission

Returns

void

sendAudio()

sendAudio(data): void;

Send audio data to the server

Parameters

Parameter	Type	Description
`data`	`AudioData`	Audio data as Uint8Array or ArrayBuffer

Returns

void

Throws

AbortError If aborted

Throws

StateError If not connected

sendStream()

sendStream(stream, options?): Promise<void>;

Stream audio data from an async iterable source.

Parameters

Parameter	Type	Description
`stream`	`AsyncIterable`<`AudioData`>	Async iterable yielding audio chunks
`options?`	`SendStreamOptions`	Optional pacing and auto-finish settings

Returns

Promise<void>

Throws

AbortError If aborted during streaming

Throws

StateError If not connected
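
Any async iterable of audio chunks can be passed as `stream`. The sketch below uses an async generator to slice an in-memory buffer into fixed-size chunks. `StreamSink`, `chunkAudio`, `sendBuffer`, and the 4096-byte default are assumptions made for illustration; `AudioData` is narrowed to Uint8Array, and the optional `SendStreamOptions` argument is left out because its fields are not listed here.

```typescript
// Stand-in for the one session method used here (an assumption for this
// sketch, not the SDK's session type).
interface StreamSink {
  sendStream(stream: AsyncIterable<Uint8Array>): Promise<void>;
}

// Yield fixed-size views over `audio` without copying the bytes.
// The final chunk may be shorter than `chunkSize`.
async function* chunkAudio(
  audio: Uint8Array,
  chunkSize: number,
): AsyncGenerator<Uint8Array> {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  for (let offset = 0; offset < audio.length; offset += chunkSize) {
    yield audio.subarray(offset, offset + chunkSize);
  }
}

// Send a whole buffer through sendStream() in chunks.
async function sendBuffer(
  session: StreamSink,
  audio: Uint8Array,
  chunkSize = 4096,
): Promise<void> {
  await session.sendStream(chunkAudio(audio, chunkSize));
}
```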

RealtimeTtsConnection

WebSocket connection for real-time Text-to-Speech.

Supports up to 5 concurrent streams multiplexed by stream_id. The connection automatically sends keepalive messages while open.

Example

const conn = new RealtimeTtsConnection(apiKey, wsUrl, ttsDefaults);
await conn.connect();

const s1 = await conn.stream({ model, voice, language, audio_format }); // stream() returns a Promise
s1.sendText("Hello");
s1.finish();
for await (const chunk of s1) { process(chunk); }

conn.close();
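
With a limit of 5 concurrent streams per connection, synthesizing a longer list of texts needs some form of throttling. The helper below is a generic sketch of that idea and does not call the SDK: `mapWithLimit` and `MAX_CONCURRENT_STREAMS` are hypothetical names, and the task passed in would be whatever opens a stream, sends text, and drains the audio.

```typescript
// Documented per-connection limit on concurrent TTS streams.
const MAX_CONCURRENT_STREAMS = 5;

// Run `task` over every item with at most `limit` tasks in flight.
// Results are returned in input order.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Each worker pulls the next unclaimed index until none remain.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  };
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```

If any task rejects, the returned promise rejects as well, while tasks already in flight continue to run; callers that need cleanup should handle errors inside the task.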

Extends

TypedEmitter<TtsConnectionEvents>

isConnected

get isConnected(): boolean;

Whether the WebSocket is connected.

Returns

boolean

Constructor

new RealtimeTtsConnection(
   apiKey, 
   wsUrl, 
   ttsDefaults?, 
   options?): RealtimeTtsConnection;

Parameters

Parameter	Type
`apiKey`	`string`
`wsUrl`	`string`
`ttsDefaults?`	`Partial`<`TtsStreamConfig`>
`options?`	`TtsConnectionOptions`

Returns

RealtimeTtsConnection

Overrides

TypedEmitter<TtsConnectionEvents>.constructor

close()

close(): void;

Close the WebSocket connection and terminate all active streams.

Returns

void

connect()

connect(): Promise<void>;

Open the WebSocket connection and start keepalive. Called automatically by stream() if the connection is not yet open.

Returns

Promise<void>

emit()

emit<E>(event, ...args): void;

Emit an event to all registered handlers. An error thrown by one handler does not prevent the remaining handlers from running. Such errors are reported through the error event if a handler is registered for it; otherwise they are rethrown asynchronously.

Type Parameters

Type Parameter
`E` extends keyof `TtsConnectionEvents`

Parameters

Parameter	Type
`event`	`E`
...`args`	`Parameters`<`TtsConnectionEvents`[`E`]>

Returns

void

Inherited from

TypedEmitter.emit
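
The error-isolation behaviour of emit() can be illustrated with a small emitter. This is a sketch of the semantics described above, not the TypedEmitter source: `SketchEmitter` is a hypothetical class, it uses plain string event names instead of a typed event map, and the use of setTimeout for the asynchronous rethrow is an assumption.

```typescript
type Handler = (...args: any[]) => void;

class SketchEmitter {
  private readonly handlers = new Map<string, Set<Handler>>();

  on(event: string, handler: Handler): this {
    let set = this.handlers.get(event);
    if (!set) {
      set = new Set<Handler>();
      this.handlers.set(event, set);
    }
    set.add(handler);
    return this;
  }

  emit(event: string, ...args: unknown[]): void {
    const set = this.handlers.get(event);
    // Iterate over a copy so handlers may add or remove listeners safely.
    const current: Handler[] = set ? Array.from(set) : [];
    for (const handler of current) {
      try {
        handler(...args);
      } catch (err) {
        // A failing handler must not stop the remaining handlers.
        this.report(event, err);
      }
    }
  }

  private report(sourceEvent: string, err: unknown): void {
    const errorHandlers = this.handlers.get("error");
    const canReport =
      sourceEvent !== "error" && errorHandlers !== undefined && errorHandlers.size > 0;
    if (canReport) {
      this.emit("error", err);
    } else {
      // No usable error listener: rethrow outside the current call stack.
      setTimeout(() => {
        throw err;
      }, 0);
    }
  }
}
```

Errors thrown by an error handler itself are not routed back into the error event in this sketch, which avoids an infinite loop.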

off()

off<E>(event, handler): this;

Remove an event handler.

Type Parameters

Type Parameter
`E` extends keyof `TtsConnectionEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`TtsConnectionEvents`[`E`]

Returns

this

Inherited from

TypedEmitter.off

on()

on<E>(event, handler): this;

Type Parameters

Type Parameter
`E` extends keyof `TtsConnectionEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`TtsConnectionEvents`[`E`]

Returns

this

Inherited from

TypedEmitter.on

once()

once<E>(event, handler): this;

Type Parameters

Type Parameter
`E` extends keyof `TtsConnectionEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`TtsConnectionEvents`[`E`]

Returns

this

Inherited from

TypedEmitter.once

removeAllListeners()

removeAllListeners(event?): void;

Remove all event handlers.

Parameters

Parameter	Type
`event?`	keyof TtsConnectionEvents

Returns

void

Inherited from

TypedEmitter.removeAllListeners

stream()

stream(input?): Promise<RealtimeTtsStream>;

Open a new TTS stream on this connection. Auto-connects if the WebSocket is not yet open.

Parameters

Parameter	Type	Description
`input?`	`TtsStreamInput`	Stream configuration (merged with tts_defaults)

Returns

Promise<RealtimeTtsStream>

A ready-to-use stream handle

RealtimeTtsStream

Handle for one TTS stream on a WebSocket connection.

Emits typed events and supports async iteration over decoded audio chunks.

Examples

stream.on('audio', (chunk) => process(chunk));
stream.on('terminated', () => console.log('done'));
stream.sendText("Hello world");
stream.finish();

stream.sendText("Hello world");
stream.finish();
for await (const chunk of stream) {
  process(chunk);
}

Extends

TypedEmitter<TtsStreamEvents>

state

get state(): TtsStreamState;

Current stream lifecycle state.

Returns

TtsStreamState

[asyncIterator]()

[asyncIterator](): AsyncIterator<Uint8Array<ArrayBufferLike>>;

Async iterator that yields decoded audio chunks.

The returned iterator's return() resets the internal iterator-attach flag and drops any buffered audio, so consumers that exit for await early (via break etc.) stop accruing memory while the stream keeps receiving server audio.

Returns

AsyncIterator<Uint8Array<ArrayBufferLike>>

cancel()

cancel(): void;

Cancel this stream. The server will stop generating and send terminated.

Returns

void

close()

close(): void;

Close this stream. For single-stream usage (created via tts(input)), also closes the underlying WebSocket connection.

Returns

void

emit()

emit<E>(event, ...args): void;

Emit an event to all registered handlers. An error thrown by one handler does not prevent the remaining handlers from running. Such errors are reported through the error event if a handler is registered for it; otherwise they are rethrown asynchronously.

Type Parameters

Type Parameter
`E` extends keyof `TtsStreamEvents`

Parameters

Parameter	Type
`event`	`E`
...`args`	`Parameters`<`TtsStreamEvents`[`E`]>

Returns

void

Inherited from

TypedEmitter.emit

finish()

finish(): void;

Signal that no more text will be sent for this stream. The server will finish generating audio and send terminated.

Returns

void

off()

off<E>(event, handler): this;

Remove an event handler.

Type Parameters

Type Parameter
`E` extends keyof `TtsStreamEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`TtsStreamEvents`[`E`]

Returns

this

Inherited from

TypedEmitter.off

on()

on<E>(event, handler): this;

Type Parameters

Type Parameter
`E` extends keyof `TtsStreamEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`TtsStreamEvents`[`E`]

Returns

this

Inherited from

TypedEmitter.on

once()

once<E>(event, handler): this;

Type Parameters

Type Parameter
`E` extends keyof `TtsStreamEvents`

Parameters

Parameter	Type
`event`	`E`
`handler`	`TtsStreamEvents`[`E`]

Returns

this

Inherited from

TypedEmitter.once

removeAllListeners()

removeAllListeners(event?): void;

Remove all event handlers.

Parameters

Parameter	Type
`event?`	keyof TtsStreamEvents

Returns

void

Inherited from

TypedEmitter.removeAllListeners

sendStream()

sendStream(source): Promise<void>;

Pipe an async iterable of text chunks into the stream. Automatically calls finish when the iterable completes.

Designed for concurrent use: call sendStream() and consume audio via for await or events simultaneously.

Parameters

Parameter	Type
`source`	`AsyncIterable`<`string`>

Returns

Promise<void>

Example

stream.sendStream(llmTokenStream);
for await (const audio of stream) { forward(audio); }
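
When sending and consuming run concurrently, it can help to await both sides so that a failure in either one surfaces to the caller. The following sketch shows that pattern under stated assumptions: `TextToAudioStream` is a structural stand-in for the two stream capabilities used, and `speak` is a hypothetical helper, not part of the SDK.

```typescript
// Stand-in for the parts of a TTS stream used below (an assumption for
// this sketch): text goes in through sendStream(), audio comes out by
// async iteration.
interface TextToAudioStream extends AsyncIterable<Uint8Array> {
  sendStream(source: AsyncIterable<string>): Promise<void>;
}

// Pipe text into the stream and forward audio chunks as they arrive.
async function speak(
  stream: TextToAudioStream,
  text: AsyncIterable<string>,
  onAudio: (chunk: Uint8Array) => void,
): Promise<void> {
  // Start sending without awaiting so that receiving can begin immediately.
  const sending = stream.sendStream(text);
  const receiving = (async (): Promise<void> => {
    for await (const chunk of stream) {
      onAudio(chunk);
    }
  })();
  // Awaiting both means a rejection on either side rejects this call.
  await Promise.all([sending, receiving]);
}
```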

sendText()

sendText(text, options?): void;

Send one text chunk to the TTS stream.

Parameters

Parameter	Type	Description
`text`	`string`	Text to synthesize
`options?`	{ `end?`: `boolean`; }	-
`options.end?`	`boolean`	If true, signals this is the final text chunk

Returns

void

Properties

Property	Type
`streamId`	`string`

SonioxError

Extends

Error

Constructor

new SonioxError(
   message, 
   code?, 
   statusCode?, 
   cause?): SonioxError;

Parameters

Parameter	Type
`message`	`string`
`code?`	`SonioxErrorCode | (string & {})`
`statusCode?`	`number`
`cause?`	`unknown`

Returns

SonioxError

Overrides

Error.constructor

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Properties

Property	Type	Description
`cause`	`unknown`	The underlying error that caused this error, if any.
`code`	`SonioxErrorCode | (string & {})`	Error code describing the type of error. Typed as `string` at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
`statusCode`	`number | undefined`	HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

SonioxHttpError

HTTP error class for all HTTP-related failures (REST API).

Thrown when HTTP requests fail due to network issues, timeouts, server errors, or response parsing failures.

Extends

SonioxError

Constructor

new SonioxHttpError(details): SonioxHttpError;

Parameters

Parameter	Type
`details`	`HttpErrorDetails`

Returns

SonioxHttpError

Overrides

SonioxError.constructor

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Overrides

SonioxError.toJSON

toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Overrides

SonioxError.toString

Properties

Property	Type	Description
`bodyText`	`string | undefined`	Response body text, capped at 4 KB (only for http_error/parse_error)
`cause`	`unknown`	The underlying error that caused this error, if any.
`code`	`HttpErrorCode`	Categorized HTTP error code
`headers`	`Record<string, string> | undefined`	Response headers (only for http_error)
`method`	`HttpMethod`	HTTP method
`statusCode`	`number | undefined`	HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).
`url`	`string`	Request URL
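
The `statusCode` property lends itself to a simple recovery decision. The sketch below is one possible policy, not SDK behaviour: `HttpErrorLike`, `RecoveryAction`, and `nextAction` are hypothetical names, and the choice to retry on 429 and 5xx and to refresh credentials on 401 is an assumption.

```typescript
// The single SonioxHttpError field used here, as a structural stand-in
// so the sketch does not depend on the SDK's exports.
interface HttpErrorLike {
  statusCode?: number | undefined;
}

type RecoveryAction = "retry" | "refresh_credentials" | "give_up";

// Decide what to do after a failed request, based only on the status code.
function nextAction(err: HttpErrorLike): RecoveryAction {
  const status = err.statusCode;
  if (status === 401) return "refresh_credentials"; // auth error
  if (status !== undefined && (status === 429 || status >= 500)) {
    return "retry"; // rate limit or server error
  }
  // Other 4xx responses and errors without a status are left to the caller.
  return "give_up";
}
```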

TtsRestClient

Browser-safe REST client for TTS generation.

Provides generate() (buffered) and generateStream() (streaming) using only globalThis.fetch. HTTP failures are surfaced as SonioxHttpError, matching the rest of the Soniox SDK.

Authentication uses the Authorization: Bearer <api_key> header.

Example

const client = new TtsRestClient(apiKey, 'https://tts-rt.soniox.com');
const audio = await client.generate({ text: 'Hello', voice: 'Adrian' });

Constructor

new TtsRestClient(apiKey, ttsApiUrl): TtsRestClient;

Parameters

Parameter	Type
`apiKey`	`string`
`ttsApiUrl`	`string`

Returns

TtsRestClient

generate()

generate(options): Promise<Uint8Array<ArrayBufferLike>>;

Generate speech audio from text. Returns the full audio as a Uint8Array.

Parameters

Parameter	Type
`options`	`GenerateSpeechOptions`

Returns

Promise<Uint8Array<ArrayBufferLike>>

Throws

SonioxHttpError on non-2xx responses, network failures, or aborted requests.

generateStream()

generateStream(options): AsyncIterable<Uint8Array<ArrayBufferLike>>;

Generate speech audio from text as a streaming async iterable.

Yields Uint8Array chunks as they arrive from the server response body. Lower time-to-first-audio than generate.

Known limitation: Mid-stream server errors (reported via HTTP trailers) cannot be detected through the fetch API. The iterator may end early without an explicit error. Use WebSocket TTS for reliable error detection.

Parameters

Parameter	Type
`options`	`GenerateSpeechOptions`

Returns

AsyncIterable<Uint8Array<ArrayBufferLike>>

Throws

SonioxHttpError on non-2xx responses, network failures, or aborted requests (before the stream starts).
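
If the complete audio is needed after streaming, the chunks can be concatenated as they arrive. The helper below is a minimal sketch that works with any `AsyncIterable<Uint8Array>`; `collectAudio` is a hypothetical name and not part of the SDK.

```typescript
// Drain an async iterable of chunks into one contiguous Uint8Array.
async function collectAudio(
  chunks: AsyncIterable<Uint8Array>,
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const part of chunks) {
    parts.push(part);
    total += part.length;
  }
  // Copy every chunk into its position in the final buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

Given the known limitation above, a result that is shorter than expected may indicate that the stream ended early; this helper cannot tell the two cases apart.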
