Soniox Client SDK — Class Reference

SonioxClient

Main entry point for the Soniox client SDK.

Example

// Recommended: async config with region
const client = new SonioxClient({
  config: async () => {
    const res = await fetch('/api/soniox-config', { method: 'POST' });
    return await res.json(); // { api_key, region }
  },
});

// High-level: record from microphone
const recording = client.realtime.record({ model: 'stt-rt-v4' });
recording.on('result', (r) => console.log(r.tokens));
await recording.stop();

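// Low-level: direct session access.
// stt() needs a WebSocket URL it can resolve synchronously (plain-object config
// or ws_base_url); on an async-config client like the one above it throws SonioxError.
// `key` is an API key obtained elsewhere.
const session = client.realtime.stt({ model: 'stt-rt-v4' }, { api_key: key });
await session.connect();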

permissions

get permissions(): PermissionResolver | undefined;

Permission resolver, if configured. Returns undefined if no resolver was provided (SSR-safe).

Example

const mic = await client.permissions?.check('microphone');
if (mic?.status === 'denied') {
  showSettingsMessage();
}

Returns

PermissionResolver | undefined

Constructor

new SonioxClient(options): SonioxClient;

Parameters

Parameter  Type
options    SonioxClientOptions

Returns

SonioxClient

Properties

realtime
Type: { record: (options) => Recording; stt: (config, options) => RealtimeSttSession; tts: ClientTtsFactory; }
Real-time API namespace.

realtime.record
Type: (options) => Recording
Start a high-level recording session. Returns synchronously so callers can attach event listeners before any async work (key fetch, mic access, connection) begins.

realtime.stt
Type: (config, options) => RealtimeSttSession
Create a low-level STT session. The WebSocket URL is derived from the client's config (respecting region / base_domain / stt_ws_url) when config is a plain object, or from ws_base_url on the legacy path. If config was passed as an async function, call client.realtime.record() instead, or pass ws_base_url explicitly to SonioxClient. Throws SonioxError if the WebSocket URL cannot be resolved synchronously (async-config client without ws_base_url).

realtime.tts
Type: ClientTtsFactory
TTS factory: callable for single-stream, .multiStream() for multi-stream. Uses the client's config resolver to obtain credentials and the TTS WebSocket URL.

Examples

const stream = await client.realtime.tts({
  model: 'tts-rt-v1-preview',
  voice: 'Adrian',
  language: 'en',
  audio_format: 'wav',
});
stream.sendText("Hello");
stream.finish();
for await (const chunk of stream) {
  process(chunk);
}

const conn = await client.realtime.tts.multiStream();
const s1 = await conn.stream({
  model: 'tts-rt-v1-preview',
  voice: 'Adrian',
  language: 'en',
  audio_format: 'wav',
});

tts
Type: { generate: Promise<Uint8Array<ArrayBufferLike>>; generateStream: AsyncIterable<Uint8Array<ArrayBufferLike>>; }
REST TTS API namespace.

Example

const audio = await client.tts.generate({
  text: 'Hello',
  voice: 'Adrian',
  language: 'en',
});

tts.generate
Type: Promise<Uint8Array<ArrayBufferLike>>

tts.generateStream
Type: AsyncIterable<Uint8Array<ArrayBufferLike>>

Recording

state

get state(): RecordingState;

Current recording state

Returns

RecordingState

cancel()

cancel(): void;

Immediately cancel recording without waiting for final results

Returns

void


finalize()

finalize(options?): void;

Request the server to finalize current non-final tokens.

Parameters

Parameter                     Type
options?                      { trailing_silence_ms?: number; }
options.trailing_silence_ms?  number

Returns

void


off()

off<E>(event, handler): this;

Remove an event handler

Type Parameters

Type Parameter
E extends keyof RecordingEvents

Parameters

Parameter  Type
event      E
handler    RecordingEvents[E]

Returns

this


on()

on<E>(event, handler): this;

Register an event handler

Type Parameters

Type Parameter
E extends keyof RecordingEvents

Parameters

Parameter  Type
event      E
handler    RecordingEvents[E]

Returns

this


once()

once<E>(event, handler): this;

Register a one-time event handler

Type Parameters

Type Parameter
E extends keyof RecordingEvents

Parameters

Parameter  Type
event      E
handler    RecordingEvents[E]

Returns

this


pause()

pause(): void;

Pause recording.

Pauses the audio source (stops microphone capture) and pauses the session (activates automatic keepalive to prevent server disconnect).

Returns

void


reconnect()

reconnect(): void;

Force a reconnection — tears down the current session and audio encoder, then establishes a new session via the standard reconnect flow (backoff, config re-resolution, buffer drain).

Use this to recover from stale connections after platform lifecycle events such as laptop sleep/wake (web visibilitychange) or app backgrounding (React Native AppState).

Requires auto_reconnect to be enabled. No-op when the recording is not in recording or paused state.

Returns

void
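
The sleep/wake recovery described above can be sketched as a small helper that calls reconnect() when the page becomes visible again. This is a minimal sketch, not SDK code: `reconnectOnVisible`, `ReconnectableRecording`, and `VisibilityTarget` are hypothetical names introduced here, and the state strings `'recording'` and `'paused'` are taken from the description of reconnect().

```typescript
// Hypothetical structural types so the sketch runs without the SDK or a DOM.
interface ReconnectableRecording {
  readonly state: string;
  reconnect(): void;
}

interface VisibilityTarget {
  readonly visibilityState: string;
  addEventListener(type: 'visibilitychange', listener: () => void): void;
  removeEventListener(type: 'visibilitychange', listener: () => void): void;
}

/**
 * Calls recording.reconnect() whenever the target becomes visible again.
 * Returns a function that removes the listener.
 */
function reconnectOnVisible(
  recording: ReconnectableRecording,
  target: VisibilityTarget,
): () => void {
  const listener = (): void => {
    if (target.visibilityState !== 'visible') return;
    // reconnect() is documented as a no-op outside these states; the explicit
    // check only keeps the intent visible at the call site.
    if (recording.state === 'recording' || recording.state === 'paused') {
      recording.reconnect();
    }
  };
  target.addEventListener('visibilitychange', listener);
  return () => target.removeEventListener('visibilitychange', listener);
}
```

In a browser the target would typically be `document`; the recording must have been created with auto_reconnect enabled for reconnect() to have any effect.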


resume()

resume(): void;

Resume recording after pause.

Resumes the audio source and session. Audio capture and transmission continue from where they left off. If audio was buffered during a reconnect while paused, the buffer is drained now.

Returns

void


stop()

stop(): Promise<void>;

Gracefully stop recording

Stops the audio source and waits for the server to process all buffered audio and return final results.

Returns

Promise<void>

Promise that resolves when the server acknowledges completion
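
Because stop() waits for the server while cancel() returns immediately, the two can be combined into a bounded shutdown. The following is a sketch under assumptions: `stopOrCancel` and `StoppableRecording` are hypothetical names, and the timeout policy is an application choice rather than anything the SDK prescribes.

```typescript
// Hypothetical structural type covering only the two methods used here.
interface StoppableRecording {
  stop(): Promise<void>;
  cancel(): void;
}

/**
 * Tries a graceful stop(); if it has not resolved within `timeoutMs`,
 * falls back to cancel(). Resolves to the path that was taken.
 */
async function stopOrCancel(
  recording: StoppableRecording,
  timeoutMs: number,
): Promise<'stopped' | 'cancelled'> {
  let clearTimer = (): void => {};
  const timedOut = new Promise<'cancelled'>((resolve) => {
    const timer = setTimeout(() => resolve('cancelled'), timeoutMs);
    clearTimer = () => clearTimeout(timer);
  });
  try {
    const outcome = await Promise.race([
      recording.stop().then(() => 'stopped' as const),
      timedOut,
    ]);
    // Final results are dropped on this path, as cancel() does not wait for them.
    if (outcome === 'cancelled') recording.cancel();
    return outcome;
  } finally {
    clearTimer();
  }
}
```

If stop() rejects before the timeout, the rejection propagates to the caller unchanged.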


MicrophoneSource

Browser microphone audio source

Uses navigator.mediaDevices.getUserMedia to capture audio from the microphone and MediaRecorder to encode it into chunks.

Example

const source = new MicrophoneSource();
await source.start({
  onData: (chunk) => session.sendAudio(chunk),
  onError: (err) => console.error(err),
});
// Later:
source.stop();

Constructor

new MicrophoneSource(options): MicrophoneSource;

Parameters

Parameter  Type
options    MicrophoneSourceOptions

Returns

MicrophoneSource

pause()

pause(): void;

Pause audio capture

Returns

void


restart()

restart(): void;

Reinitialize the MediaRecorder on the existing stream so the next chunks contain a fresh container header (required after reconnecting to a new server session).

Returns

void


resume()

resume(): void;

Resume audio capture

Returns

void


start()

start(handlers): Promise<void>;

Request microphone access and start recording

Parameters

Parameter  Type
handlers   AudioSourceHandlers

Returns

Promise<void>

Throws

AudioUnavailableError if getUserMedia or MediaRecorder is not supported

AudioPermissionError if microphone access is denied

AudioDeviceError if no microphone is found
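
The three failure modes of start() can be told apart with instanceof checks. The sketch below declares stand-in classes with the same names so that it runs on its own; in an application these would be the SDK's exported error classes. `describeStartFailure` and the message texts are hypothetical.

```typescript
// Stand-ins for the SDK error classes, declared locally so the sketch is
// self-contained. They are assumptions, not the SDK's definitions.
class AudioUnavailableError extends Error {}
class AudioPermissionError extends Error {}
class AudioDeviceError extends Error {}

/** Maps a failure from start() to a message suitable for the user. */
function describeStartFailure(err: unknown): string {
  if (err instanceof AudioPermissionError) {
    return 'Microphone access was denied. Allow it in the browser settings and retry.';
  }
  if (err instanceof AudioDeviceError) {
    return 'No microphone was found. Connect one and retry.';
  }
  if (err instanceof AudioUnavailableError) {
    return 'Audio capture is not supported in this environment.';
  }
  return 'Could not start recording.';
}
```

A typical call site would wrap `await source.start(handlers)` in try/catch and pass the caught value to this helper.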


stop()

stop(): void;

Stop recording and release all resources

Returns

void


BrowserPermissionResolver

Browser permission resolver for checking and requesting microphone access.

Example

const resolver = new BrowserPermissionResolver();
const mic = await resolver.check('microphone');
if (mic.status === 'prompt') {
  const result = await resolver.request('microphone');
  if (result.status === 'denied') {
    showDeniedMessage();
  }
}

Constructor

new BrowserPermissionResolver(): BrowserPermissionResolver;

Returns

BrowserPermissionResolver

check()

check(permission): Promise<PermissionResult>;

Check current microphone permission status without prompting the user.

Parameters

Parameter   Type
permission  "microphone"

Returns

Promise<PermissionResult>


request()

request(permission): Promise<PermissionResult>;

Request microphone permission from the user. This may show a browser permission prompt.

Parameters

Parameter   Type
permission  "microphone"

Returns

Promise<PermissionResult>
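
check() and request() are often combined so that the prompt is shown only when it can help. The sketch below assumes a `'granted'` status value alongside the `'prompt'` and `'denied'` values shown in the example above; `ensureMicrophone`, `MicPermissionResolver`, and `PermissionStatusLike` are hypothetical names.

```typescript
// Hypothetical structural types; the 'granted' status is an assumption,
// only 'prompt' and 'denied' appear in the examples in this reference.
interface PermissionStatusLike {
  status: string;
}

interface MicPermissionResolver {
  check(permission: 'microphone'): Promise<PermissionStatusLike>;
  request(permission: 'microphone'): Promise<PermissionStatusLike>;
}

/** Resolves to true when microphone access is available after at most one prompt. */
async function ensureMicrophone(resolver: MicPermissionResolver): Promise<boolean> {
  const current = await resolver.check('microphone');
  if (current.status === 'granted') return true;
  // Once denied, the user has to change the setting manually, so skip the prompt.
  if (current.status === 'denied') return false;
  const requested = await resolver.request('microphone');
  return requested.status === 'granted';
}
```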


AudioPermissionError

Thrown when microphone access is denied by the user or blocked by the browser.

Maps to getUserMedia NotAllowedError DOMException.

Extends

  • SonioxError

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Inherited from

SonioxError.toJSON


toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Inherited from

SonioxError.toString

Properties

Property    Type                            Description
cause       unknown                         The underlying error that caused this error, if any.
code        SonioxErrorCode | string & { }  Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
statusCode  number | undefined              HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

AudioDeviceError

Thrown when no audio input device is found

Maps to getUserMedia NotFoundError DOMException.

Extends

  • SonioxError

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Inherited from

SonioxError.toJSON


toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Inherited from

SonioxError.toString

Properties

Property    Type                            Description
cause       unknown                         The underlying error that caused this error, if any.
code        SonioxErrorCode | string & { }  Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
statusCode  number | undefined              HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

AudioUnavailableError

Thrown when audio capture is not supported in the current environment

For example, when getUserMedia or MediaRecorder is not available.

Extends

  • SonioxError

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Inherited from

SonioxError.toJSON


toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Inherited from

SonioxError.toString

Properties

Property    Type                            Description
cause       unknown                         The underlying error that caused this error, if any.
code        SonioxErrorCode | string & { }  Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
statusCode  number | undefined              HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

RealtimeTtsConnection

WebSocket connection for real-time Text-to-Speech.

Supports up to 5 concurrent streams multiplexed by stream_id. The connection automatically sends keepalive messages while open.

Example

const conn = new RealtimeTtsConnection(apiKey, wsUrl, ttsDefaults);
await conn.connect();

const s1 = await conn.stream({ model, voice, language, audio_format });
s1.sendText("Hello");
s1.finish();
for await (const chunk of s1) { ... }

conn.close();
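
Since a connection supports up to 5 concurrent streams, callers with more texts than that need to cap how many streams are open at once. One way to do this is a generic concurrency limiter; `mapWithLimit` below is a hypothetical helper, not part of the SDK, and knows nothing about TTS itself.

```typescript
/**
 * Runs `worker` over `items` with at most `limit` calls in flight.
 * Results keep the order of `items`. `limit` is expected to be >= 1.
 */
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until none are left.
  const runLane = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  };
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => runLane());
  await Promise.all(lanes);
  return results;
}
```

With a limit of 5, the worker could open a stream via `conn.stream(...)`, send its text, call `finish()`, and collect the audio before returning, so that no more than five streams exist on the connection at any time.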

Extends

  • TypedEmitter<TtsConnectionEvents>

isConnected

get isConnected(): boolean;

Whether the WebSocket is connected.

Returns

boolean

Constructor

new RealtimeTtsConnection(
   apiKey, 
   wsUrl, 
   ttsDefaults?, 
   options?): RealtimeTtsConnection;

Parameters

Parameter     Type
apiKey        string
wsUrl         string
ttsDefaults?  Partial<TtsStreamConfig>
options?      TtsConnectionOptions

Returns

RealtimeTtsConnection

Overrides

TypedEmitter<TtsConnectionEvents>.constructor

close()

close(): void;

Close the WebSocket connection and terminate all active streams.

Returns

void


connect()

connect(): Promise<void>;

Open the WebSocket connection and start keepalive. Called automatically by stream() if not yet connected.

Returns

Promise<void>


emit()

emit<E>(event, ...args): void;

Emit an event to all registered handlers. Handler errors do not prevent other handlers from running. Errors are reported to an error event if present, otherwise rethrown async.

Type Parameters

Type Parameter
E extends keyof TtsConnectionEvents

Parameters

Parameter  Type
event      E
...args    Parameters<TtsConnectionEvents[E]>

Returns

void

Inherited from

TypedEmitter.emit
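
The error-isolation behaviour described for emit() can be illustrated with a small untyped emitter. This is a sketch of the stated semantics only, not the SDK's TypedEmitter implementation; `TinyEmitter` and its internals are hypothetical.

```typescript
type Handler = (...args: unknown[]) => void;

// Illustrative emitter: a throwing handler never stops the others from running.
class TinyEmitter {
  private handlers = new Map<string, Set<Handler>>();

  on(event: string, handler: Handler): this {
    let set = this.handlers.get(event);
    if (!set) {
      set = new Set<Handler>();
      this.handlers.set(event, set);
    }
    set.add(handler);
    return this;
  }

  emit(event: string, ...args: unknown[]): void {
    const current = this.handlers.get(event);
    if (!current) return;
    // Copy first so handlers that add or remove listeners do not affect this pass.
    for (const handler of Array.from(current)) {
      try {
        handler(...args);
      } catch (err) {
        this.report(event, err);
      }
    }
  }

  private report(event: string, err: unknown): void {
    const errorHandlers = this.handlers.get('error');
    if (event !== 'error' && errorHandlers && errorHandlers.size > 0) {
      this.emit('error', err);
    } else {
      // No error listener, or the error listener itself threw: rethrow on a
      // later tick so the current emit pass can finish.
      setTimeout(() => { throw err; }, 0);
    }
  }
}
```

The practical consequence for callers is that registering an `error` handler keeps handler exceptions from surfacing as uncaught errors.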

off()

off<E>(event, handler): this;

Remove an event handler.

Type Parameters

Type Parameter
E extends keyof TtsConnectionEvents

Parameters

Parameter  Type
event      E
handler    TtsConnectionEvents[E]

Returns

this

Inherited from

TypedEmitter.off

on()

on<E>(event, handler): this;

Register an event handler.

Type Parameters

Type Parameter
E extends keyof TtsConnectionEvents

Parameters

Parameter  Type
event      E
handler    TtsConnectionEvents[E]

Returns

this

Inherited from

TypedEmitter.on

once()

once<E>(event, handler): this;

Register a one-time event handler.

Type Parameters

Type Parameter
E extends keyof TtsConnectionEvents

Parameters

Parameter  Type
event      E
handler    TtsConnectionEvents[E]

Returns

this

Inherited from

TypedEmitter.once

removeAllListeners()

removeAllListeners(event?): void;

Remove all event handlers.

Parameters

Parameter  Type
event?     keyof TtsConnectionEvents

Returns

void

Inherited from

TypedEmitter.removeAllListeners

stream()

stream(input?): Promise<RealtimeTtsStream>;

Open a new TTS stream on this connection. Auto-connects if the WebSocket is not yet open.

Parameters

Parameter  Type            Description
input?     TtsStreamInput  Stream configuration (merged with tts_defaults)

Returns

Promise<RealtimeTtsStream>

A ready-to-use stream handle


RealtimeTtsStream

Handle for one TTS stream on a WebSocket connection.

Emits typed events and supports async iteration over decoded audio chunks.

Examples

// Event-based
stream.on('audio', (chunk) => process(chunk));
stream.on('terminated', () => console.log('done'));
stream.sendText("Hello world");
stream.finish();

// Async iteration
stream.sendText("Hello world");
stream.finish();
for await (const chunk of stream) {
  process(chunk);
}

Extends

  • TypedEmitter<TtsStreamEvents>

state

get state(): TtsStreamState;

Current stream lifecycle state.

Returns

TtsStreamState

[asyncIterator]()

[Symbol.asyncIterator](): AsyncIterator<Uint8Array<ArrayBufferLike>>;

Async iterator that yields decoded audio chunks.

Returns

AsyncIterator<Uint8Array<ArrayBufferLike>>
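
When the whole utterance is needed as one buffer, the chunks yielded by the iterator can be concatenated. The helper below is a sketch that works on any `AsyncIterable<Uint8Array>`; `collectAudio` is a hypothetical name and is not provided by the SDK.

```typescript
/** Drains an async iterable of byte chunks into one contiguous Uint8Array. */
async function collectAudio(chunks: AsyncIterable<Uint8Array>): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of chunks) {
    parts.push(chunk);
    total += chunk.byteLength;
  }
  // Allocate once and copy each part at its running offset.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.byteLength;
  }
  return out;
}
```

Buffering gives up the latency benefit of streaming, so this fits cases such as saving a file rather than live playback.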


cancel()

cancel(): void;

Cancel this stream. The server will stop generating and send terminated.

Returns

void


close()

close(): void;

Close this stream. For single-stream usage (created via tts(input)), also closes the underlying WebSocket connection.

Returns

void


emit()

emit<E>(event, ...args): void;

Emit an event to all registered handlers. Handler errors do not prevent other handlers from running. Errors are reported to an error event if present, otherwise rethrown async.

Type Parameters

Type Parameter
E extends keyof TtsStreamEvents

Parameters

Parameter  Type
event      E
...args    Parameters<TtsStreamEvents[E]>

Returns

void

Inherited from

TypedEmitter.emit

finish()

finish(): void;

Signal that no more text will be sent for this stream. The server will finish generating audio and send terminated.

Returns

void


off()

off<E>(event, handler): this;

Remove an event handler.

Type Parameters

Type Parameter
E extends keyof TtsStreamEvents

Parameters

Parameter  Type
event      E
handler    TtsStreamEvents[E]

Returns

this

Inherited from

TypedEmitter.off

on()

on<E>(event, handler): this;

Register an event handler.

Type Parameters

Type Parameter
E extends keyof TtsStreamEvents

Parameters

Parameter  Type
event      E
handler    TtsStreamEvents[E]

Returns

this

Inherited from

TypedEmitter.on

once()

once<E>(event, handler): this;

Register a one-time event handler.

Type Parameters

Type Parameter
E extends keyof TtsStreamEvents

Parameters

Parameter  Type
event      E
handler    TtsStreamEvents[E]

Returns

this

Inherited from

TypedEmitter.once

removeAllListeners()

removeAllListeners(event?): void;

Remove all event handlers.

Parameters

Parameter  Type
event?     keyof TtsStreamEvents

Returns

void

Inherited from

TypedEmitter.removeAllListeners

sendStream()

sendStream(source): Promise<void>;

Pipe an async iterable of text chunks into the stream. Automatically calls finish when the iterable completes.

Designed for concurrent use: call sendStream() and consume audio via for await or events simultaneously.

Parameters

Parameter  Type
source     AsyncIterable<string>

Returns

Promise<void>

Example

// Not awaited on purpose: text is sent while the loop below consumes audio.
stream.sendStream(llmTokenStream);
for await (const audio of stream) { forward(audio); }
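
sendStream() accepts any `AsyncIterable<string>`, so a token stream can be reshaped by an async generator before it is passed in. As an illustration, the sketch below regroups arbitrary tokens into sentence-sized chunks. `bySentence` is a hypothetical adapter, and whether sentence-sized chunks suit a given use case is an assumption, not something this reference states.

```typescript
/**
 * Regroups a stream of text fragments so that each yielded chunk ends at
 * sentence punctuation followed by whitespace. Leftover text is yielded last.
 */
async function* bySentence(tokens: AsyncIterable<string>): AsyncGenerator<string> {
  const boundary = /[.!?]\s+/;
  let buffer = '';
  for await (const token of tokens) {
    buffer += token;
    // Emit every complete sentence currently in the buffer.
    let match = boundary.exec(buffer);
    while (match) {
      const end = match.index + match[0].length;
      yield buffer.slice(0, end);
      buffer = buffer.slice(end);
      match = boundary.exec(buffer);
    }
  }
  if (buffer.length > 0) yield buffer; // trailing text without terminal punctuation
}
```

The result can be passed directly, for example `stream.sendStream(bySentence(llmTokenStream))`.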

sendText()

sendText(text, options?): void;

Send one text chunk to the TTS stream.

Parameters

Parameter     Type                Description
text          string              Text to synthesize
options?      { end?: boolean; }  -
options.end?  boolean             If true, signals this is the final text chunk

Returns

void

Properties

Property  Type
streamId  string

SonioxError

Extends

  • Error

Extended by

Constructor

new SonioxError(
   message, 
   code?, 
   statusCode?, 
   cause?): SonioxError;

Parameters

Parameter    Type
message      string
code?        SonioxErrorCode | string & { }
statusCode?  number
cause?       unknown

Returns

SonioxError

Overrides

Error.constructor

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>


toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Properties

Property    Type                            Description
cause       unknown                         The underlying error that caused this error, if any.
code        SonioxErrorCode | string & { }  Error code describing the type of error. Typed as string at the base level to allow subclasses (e.g. HTTP errors) to use their own error code unions.
statusCode  number | undefined              HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).

SonioxHttpError

HTTP error class for all HTTP-related failures (REST API).

Thrown when HTTP requests fail due to network issues, timeouts, server errors, or response parsing failures.

Extends

  • SonioxError

Constructor

new SonioxHttpError(details): SonioxHttpError;

Parameters

Parameter  Type
details    HttpErrorDetails

Returns

SonioxHttpError

Overrides

SonioxError.constructor

toJSON()

toJSON(): Record<string, unknown>;

Converts to a plain object for logging/serialization

Returns

Record<string, unknown>

Overrides

SonioxError.toJSON


toString()

toString(): string;

Creates a human-readable string representation

Returns

string

Overrides

SonioxError.toString

Properties

Property    Type                                Description
bodyText    string | undefined                  Response body text, capped at 4 KB (only for http_error/parse_error)
cause       unknown                             The underlying error that caused this error, if any.
code        HttpErrorCode                       Categorized HTTP error code
headers     Record<string, string> | undefined  Response headers (only for http_error)
method      HttpMethod                          HTTP method
statusCode  number | undefined                  HTTP status code when applicable (e.g., 401 for auth errors, 500 for server errors).
url         string                              Request URL
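
The statusCode property is enough to drive a simple retry decision. The policy below is an assumption chosen for illustration (retry on 408, 429, 5xx, or when no status is available), and `isRetryable`, `backoffDelayMs`, and `HttpErrorLike` are hypothetical names. Aborted requests also lack a status, so a real policy would likely inspect `code` as well; its possible values are not listed in this reference.

```typescript
// Hypothetical structural type: only the field this sketch reads.
interface HttpErrorLike {
  statusCode?: number;
}

/** Assumed policy: transient statuses and missing statuses are retried. */
function isRetryable(err: HttpErrorLike): boolean {
  const status = err.statusCode;
  if (status === undefined) return true; // no HTTP response, e.g. a network failure
  return status === 408 || status === 429 || status >= 500;
}

/** Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

Authentication failures such as 401 are deliberately not retried, since repeating the request with the same key would fail the same way.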

TtsRestClient

Browser-safe REST client for TTS generation.

Provides generate() (buffered) and generateStream() (streaming) using only globalThis.fetch. HTTP failures are surfaced as SonioxHttpError, matching the rest of the Soniox SDK.

Authentication uses the Authorization: Bearer <api_key> header.

Example

const client = new TtsRestClient(apiKey, 'https://tts-rt.soniox.com');
const audio = await client.generate({ text: 'Hello', voice: 'Adrian' });

Constructor

new TtsRestClient(apiKey, ttsApiUrl): TtsRestClient;

Parameters

Parameter  Type
apiKey     string
ttsApiUrl  string

Returns

TtsRestClient

generate()

generate(options): Promise<Uint8Array<ArrayBufferLike>>;

Generate speech audio from text. Returns the full audio as a Uint8Array.

Parameters

Parameter  Type
options    GenerateSpeechOptions

Returns

Promise<Uint8Array<ArrayBufferLike>>

Throws

SonioxHttpError on non-2xx responses, network failures, or aborted requests.


generateStream()

generateStream(options): AsyncIterable<Uint8Array<ArrayBufferLike>>;

Generate speech audio from text as a streaming async iterable.

Yields Uint8Array chunks as they arrive from the server response body. Lower time-to-first-audio than generate.

Known limitation: Mid-stream server errors (reported via HTTP trailers) cannot be detected through the fetch API. The iterator may end early without an explicit error. Use WebSocket TTS for reliable error detection.

Parameters

Parameter  Type
options    GenerateSpeechOptions

Returns

AsyncIterable<Uint8Array<ArrayBufferLike>>

Throws

SonioxHttpError on non-2xx responses, network failures, or aborted requests (before the stream starts).