Types
Soniox Client SDK — Types Reference
Type Alias: ApiKeyConfig
API key configuration.
- string - A pre-fetched temporary API key (e.g., injected from SSR)
- () => Promise<string> - An async function that fetches a fresh temporary key from your backend. Called once per recording session.
Deprecated
Use SonioxConnectionConfig with SonioxClientOptions.config instead.
Example
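The two accepted forms can be sketched as follows. This is a minimal illustration rather than SDK code: ApiKeyConfig is mirrored locally from the description above, and fetchKeyFromBackend and resolveKey are hypothetical stand-ins (a real app would call its own backend and let the SDK resolve the key).

```typescript
// Local mirror of the alias described above (assumed shape).
type ApiKeyConfig = string | (() => Promise<string>);

// Hypothetical stand-in for a request to your own backend.
async function fetchKeyFromBackend(): Promise<string> {
  return "temp-key-from-backend";
}

// Form 1: a pre-fetched temporary key.
const staticKey: ApiKeyConfig = "temp-key-from-ssr";

// Form 2: an async function, invoked when a key is needed.
const lazyKey: ApiKeyConfig = () => fetchKeyFromBackend();

// Hypothetical helper showing how either form reduces to a plain string.
async function resolveKey(config: ApiKeyConfig): Promise<string> {
  const key = typeof config === "function" ? await config() : config;
  if (typeof key !== "string") {
    throw new Error("API key function must return a string");
  }
  return key;
}
```

The function form is the better fit when keys expire quickly, because the key is fetched only when a session starts.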
Note: If you use Node.js, you can use the SonioxNodeClient to fetch a temporary API key via client.auth.createTemporaryKey().
AudioErrorCode
Error codes for audio-related errors
AudioSourceHandlers
Callbacks for receiving audio data and errors from an AudioSource.
Properties
| Property | Type | Description |
|---|---|---|
| onData | (chunk) => void | Called when an audio chunk is available. |
| onError | (error) => void | Called when a runtime error occurs during audio capture (after start). |
| onMuted? | () => void | Called when the audio source is muted externally (e.g. OS-level or hardware mute). |
| onUnmuted? | () => void | Called when the audio source is unmuted after an external mute. |
GenerateSpeechOptions
Options for REST TTS generation (generate / generateStream).
Properties
| Property | Type | Description |
|---|---|---|
| audio_format? | string | Output audio format. Default: 'wav' |
| bitrate? | number | Codec bitrate in bps (for compressed formats). |
| language? | string | Language code. Default: 'en' |
| model? | string | Text-to-Speech model to use. Default: 'tts-rt-v1-preview' |
| sample_rate? | number | Output sample rate in Hz. Required for raw PCM formats. |
| signal? | AbortSignal | Optional AbortSignal for cancellation. |
| text | string | Input text to generate as speech. |
| voice | string | Voice identifier. |
HttpErrorCode
Error codes for HTTP client errors
HttpMethod
HTTP methods supported by the client
MicrophoneSourceOptions
Options for MicrophoneSource
Properties
| Property | Type | Description |
|---|---|---|
| constraints? | MediaTrackConstraints | MediaTrackConstraints for the audio track. Default: { echoCancellation: false, noiseSuppression: false, autoGainControl: false, channelCount: 1, sampleRate: 16000 } |
| recorderOptions? | MediaRecorderOptions | MediaRecorder options. See https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder/MediaRecorder |
| timesliceMs? | number | Time interval in milliseconds between audio data chunks. Default: 60 |
PermissionResult
Result of a permission check or request.
Properties
| Property | Type | Description |
|---|---|---|
| can_request | boolean | Whether the user can be prompted again. false means permanently denied (e.g., browser "Block" or iOS settings). Useful for showing "go to settings" instructions. |
| status | PermissionStatus | Current permission status. |
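A sketch of how the two fields might drive UI decisions. The types are mirrored locally; only 'granted' is named in this reference, so the other status values ('denied', 'prompt') and the NextStep names are assumptions made for illustration.

```typescript
// Assumed status values: only 'granted' appears in this reference;
// 'denied' and 'prompt' are hypothetical placeholders.
type PermissionStatus = "granted" | "denied" | "prompt";

interface PermissionResult {
  status: PermissionStatus;
  can_request: boolean;
}

// Hypothetical set of UI actions.
type NextStep = "start_recording" | "ask_user" | "show_settings_help";

// Decide what the UI should do for a given permission result.
function nextStep(result: PermissionResult): NextStep {
  if (result.status === "granted") return "start_recording";
  // can_request === false means the user cannot be prompted again,
  // so the only way forward is the browser or OS settings.
  return result.can_request ? "ask_user" : "show_settings_help";
}
```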
PermissionStatus
Unified permission status across all platforms.
PermissionType
Permission types supported by the resolver.
RecordOptions
Options for creating a recording
Type Declaration
| Name | Type | Description |
|---|---|---|
| buffer_queue_size? | number | Maximum number of audio chunks to buffer while waiting for key/connection. Default: 1000 |
| session_config()? | (resolved) => SttSessionConfig | Function that receives the resolved connection config (including stt_defaults from the server) and returns the final session config. When provided, its return value is used as the session config, and any flat session config fields on this object are ignored. Example: client.realtime.record({ session_config: (resolved) => ({ ...resolved.stt_defaults, enable_endpoint_detection: true }) }); |
| session_options? | SttSessionOptions | SDK-level session options (signal, etc.) |
| signal? | AbortSignal | AbortSignal for cancellation |
| source? | AudioSource | Audio source to use. Defaults to MicrophoneSource if not provided. |
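The session_config callback described above amounts to "start from the server defaults, then override". A minimal sketch with local stand-in types: SttSessionConfig is reduced to two fields, ResolvedConnectionConfig is a hypothetical name for the resolved object, and "example-model" is a placeholder value.

```typescript
// Local stand-ins for SDK types. Only stt_defaults and
// enable_endpoint_detection come from this reference; the rest is illustrative.
interface SttSessionConfig {
  model?: string;
  enable_endpoint_detection?: boolean;
}
interface ResolvedConnectionConfig {
  stt_defaults: SttSessionConfig;
}

// Start from the server-provided defaults, then apply local overrides.
const sessionConfig = (resolved: ResolvedConnectionConfig): SttSessionConfig => ({
  ...resolved.stt_defaults,
  enable_endpoint_detection: true,
});

// Simulated resolved config, as it might arrive from the server.
const resolved: ResolvedConnectionConfig = {
  stt_defaults: { model: "example-model", enable_endpoint_detection: false },
};

// The override wins; untouched defaults pass through unchanged.
const finalConfig = sessionConfig(resolved);
```

With the SDK, the same function would be passed as the session_config field of client.realtime.record(...).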
RecordingEvents
Events emitted by a Recording instance
Properties
| Property | Type | Description |
|---|---|---|
| connected | () => void | WebSocket connected and ready. |
| endpoint | () => void | Endpoint detected (speaker finished talking). |
| error | (error) => void | Error occurred during recording. |
| finalized | () => void | Finalization complete. |
| finished | () => void | Recording finished (server acknowledged end of stream). |
| reconnected | (event) => void | Successfully reconnected after a drop. |
| reconnecting | (event) => void | About to attempt a reconnection. Call preventDefault() to cancel. |
| result | (result) => void | Parsed result received from the server. |
| session_restart | (event) => void | New STT session started (initial or after reconnect). Consumers should reset any session-local tracking state (e.g. token window comparisons). The reset_transcript flag indicates whether accumulated transcript state should also be cleared. |
| source_muted | () => void | Audio source was muted externally (e.g. OS-level or hardware mute). |
| source_unmuted | () => void | Audio source was unmuted after an external mute. |
| state_change | (update) => void | Recording state transition. |
| token | (token) => void | Individual token received. |
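The session_restart guidance can be sketched as a small tracker that separates session-local state from accumulated transcript state. The payload shapes are hypothetical (only the reset_transcript flag is named in this reference), and how handlers are attached to a Recording is not shown here.

```typescript
// Hypothetical payload shapes; only reset_transcript comes from this reference.
interface Token {
  text: string;
}
interface SessionRestartEvent {
  reset_transcript: boolean;
}

class TranscriptTracker {
  transcript = "";        // accumulated across sessions
  tokensThisSession = 0;  // session-local state

  onToken(token: Token): void {
    this.transcript += token.text;
    this.tokensThisSession += 1;
  }

  onSessionRestart(event: SessionRestartEvent): void {
    // Session-local tracking is always reset on a new session.
    this.tokensThisSession = 0;
    // Accumulated text is cleared only when the flag asks for it.
    if (event.reset_transcript) this.transcript = "";
  }
}

// Simulate a reconnect that keeps the transcript.
const tracker = new TranscriptTracker();
tracker.onToken({ text: "Hello " });
tracker.onSessionRestart({ reset_transcript: false });
tracker.onToken({ text: "world" });
```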
RecordingState
Unified recording lifecycle states.
SonioxClientOptions
Options for creating a SonioxClient instance.
Properties
| Property | Type | Description |
|---|---|---|
| api_key? | ApiKeyConfig | API key configuration: either a string (a pre-fetched temporary API key, e.g., injected from SSR) or () => Promise<string> (an async function that fetches a fresh key from your backend). Deprecated: use config instead. |
| buffer_queue_size? | number | Default maximum number of audio chunks to buffer while waiting for key/connection. Can be overridden per-recording. Default: 1000 |
| config? | SonioxConnectionConfig \| (context?) => Promise<SonioxConnectionConfig> | Connection configuration: a sync object or an async function. When provided as a function, it is called once per recording session, allowing you to fetch a fresh temporary API key and connection settings from your backend at runtime. Example (sync config with region): const client = new SonioxClient({ config: { api_key: tempKey, region: 'eu' } }); Example (async config, recommended for production): const client = new SonioxClient({ config: async () => { const res = await fetch('/api/soniox-config', { method: 'POST' }); return await res.json(); } }); The response is expected to contain { api_key, region, ... }. |
| default_session_options? | SttSessionOptions | Default session options applied to all sessions. Can be overridden per-recording. |
| permissions? | PermissionResolver | Optional permission resolver for pre-flight microphone permission checks. Not set by default (SSR-safe, RN-safe). Example: import { BrowserPermissionResolver } from '@soniox/client'; const client = new SonioxClient({ config: { api_key: tempKey }, permissions: new BrowserPermissionResolver() }); |
| ws_base_url? | string | WebSocket URL for real-time connections. Default: 'wss://stt-rt.soniox.com/transcribe-websocket'. Deprecated: use config.stt_ws_url or config.region instead. |
SttOptions
Options for creating a low-level STT session.
Properties
| Property | Type | Description |
|---|---|---|
| api_key | string | Resolved API key string (temporary key). |
| session_options? | SttSessionOptions | Session options (signal, etc.). |
TtsAudioFormat
Supported audio formats for Text-to-Speech output.
TtsConnectionEvents
Events emitted by a TTS WebSocket connection.
Properties
| Property | Type | Description |
|---|---|---|
| close | () => void | The WebSocket connection was closed. |
| error | (error) => void | A connection-level error occurred. Always a RealtimeError subclass (e.g. ConnectionError, NetworkError, AuthError). |
TtsConnectionOptions
Options for creating a TTS connection.
Properties
| Property | Type | Description |
|---|---|---|
| connect_timeout_ms? | number | Maximum time to wait for the WebSocket connection to open (milliseconds). Default: 20000 |
| keepalive_interval_ms? | number | Interval for sending keepalive messages (milliseconds). Default: 5000. Minimum: 1000 |
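The defaults and minimum above could be applied as in the following sketch. The withDefaults helper is hypothetical, and raising too-small values to the minimum is an assumption: this reference does not say whether the SDK clamps or rejects them.

```typescript
interface TtsConnectionOptions {
  connect_timeout_ms?: number;
  keepalive_interval_ms?: number;
}

// Values taken from the table above.
const DEFAULT_CONNECT_TIMEOUT_MS = 20000;
const DEFAULT_KEEPALIVE_INTERVAL_MS = 5000;
const MIN_KEEPALIVE_INTERVAL_MS = 1000;

// Assumption: values below the minimum are raised to it. The SDK may
// instead reject them; this reference does not specify.
function withDefaults(options: TtsConnectionOptions = {}): Required<TtsConnectionOptions> {
  const keepalive = options.keepalive_interval_ms ?? DEFAULT_KEEPALIVE_INTERVAL_MS;
  return {
    connect_timeout_ms: options.connect_timeout_ms ?? DEFAULT_CONNECT_TIMEOUT_MS,
    keepalive_interval_ms: Math.max(keepalive, MIN_KEEPALIVE_INTERVAL_MS),
  };
}
```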
TtsStreamConfig
Fully resolved TTS stream config sent over the WebSocket. All required fields are present after merging input with defaults.
Properties
| Property | Type |
|---|---|
| audio_format | string |
| bitrate? | number |
| language | string |
| model | string |
| sample_rate? | number |
| stream_id | string |
| voice | string |
TtsStreamEvents
Events emitted by a TTS stream.
Properties
| Property | Type | Description |
|---|---|---|
| audio | (chunk) => void | Decoded audio chunk received. |
| audioEnd | () => void | Server marked the final audio payload for this stream. |
| error | (error) => void | A stream-level error occurred. Always a RealtimeError subclass mapped from the server error_code / error_message. |
| terminated | () => void | Stream has been fully terminated by the server. |
TtsStreamInput
Input for creating a TTS stream. All fields are optional and are merged
with tts_defaults from the resolved connection config. After merging,
model, language, voice, and audio_format must be present.
Properties
| Property | Type | Description |
|---|---|---|
| audio_format? | TtsAudioFormat | Output audio format. Example: 'wav' |
| bitrate? | number | Codec bitrate in bps (for compressed formats). |
| language? | string | Language code for speech generation. Example: 'en' |
| model? | string | Text-to-Speech model to use. Example: 'tts-rt-v1-preview' |
| sample_rate? | number | Output sample rate in Hz. Required for raw PCM formats. |
| stream_id? | string | Client-generated stream identifier. Must be unique among active streams on the same connection. Auto-generated if omitted. |
| voice? | string | Voice identifier. Example: 'Adrian' |
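The merge from TtsStreamInput to a fully resolved TtsStreamConfig might look like the sketch below. The types are local mirrors (TtsAudioFormat is simplified to string because its members are not listed here), and resolveStreamConfig and the stream-N ID scheme are hypothetical, not the SDK's implementation.

```typescript
// Local mirrors of the shapes described above.
interface TtsStreamInput {
  audio_format?: string;
  bitrate?: number;
  language?: string;
  model?: string;
  sample_rate?: number;
  stream_id?: string;
  voice?: string;
}
interface TtsStreamConfig {
  audio_format: string;
  bitrate?: number;
  language: string;
  model: string;
  sample_rate?: number;
  stream_id: string;
  voice: string;
}

let nextStreamNumber = 0;

// Hypothetical merge: explicit input wins over tts_defaults.
function resolveStreamConfig(input: TtsStreamInput, ttsDefaults: TtsStreamInput): TtsStreamConfig {
  const merged = { ...ttsDefaults, ...input };
  const { model, language, voice, audio_format } = merged;
  if (!model || !language || !voice || !audio_format) {
    throw new Error("model, language, voice, and audio_format must be present after merging");
  }
  return {
    ...merged,
    model,
    language,
    voice,
    audio_format,
    // Assumed ID scheme; the SDK's auto-generated format is not documented here.
    stream_id: merged.stream_id ?? `stream-${++nextStreamNumber}`,
  };
}

// Only the voice is given; everything else comes from the defaults.
const config = resolveStreamConfig(
  { voice: "Adrian" },
  { model: "tts-rt-v1-preview", language: "en", audio_format: "wav" },
);
```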
TtsStreamState
Lifecycle states for a TTS stream.
AudioSource
Platform-agnostic audio source interface.
Implementations must:
- Begin capturing audio in start() and deliver chunks via handlers.onData
- Stop all capture and release resources in stop()
- Throw typed errors from start() if capture cannot begin (e.g., permission denied)
Example
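A minimal sketch of a custom source that replays pre-recorded chunks, for instance in tests. The interfaces are mirrored locally; the chunk type (ArrayBuffer), the error type (Error), and the BufferedSource class with its pump() method are assumptions. A real implementation would throw the SDK's typed errors (such as AudioPermissionError) from start().

```typescript
// Local mirrors of the interfaces in this reference. Assumption: chunks are
// ArrayBuffer and errors are Error; the real parameter types are not shown here.
interface AudioSourceHandlers {
  onData: (chunk: ArrayBuffer) => void;
  onError: (error: Error) => void;
  onMuted?: () => void;
  onUnmuted?: () => void;
}
interface AudioSource {
  start(handlers: AudioSourceHandlers): Promise<void>;
  stop(): void;
  pause?(): void;
  resume?(): void;
  restart?(): void;
}

// Hypothetical source that hands out queued chunks on demand.
class BufferedSource implements AudioSource {
  private handlers: AudioSourceHandlers | null = null;
  private paused = false;
  private chunks: ArrayBuffer[];

  constructor(chunks: ArrayBuffer[]) {
    this.chunks = [...chunks]; // copy so the caller's array is untouched
  }

  async start(handlers: AudioSourceHandlers): Promise<void> {
    if (this.chunks.length === 0) {
      // Failures to begin capture are reported by throwing from start().
      throw new Error("no audio available");
    }
    this.handlers = handlers;
  }

  // Deliver the next chunk; returns false when nothing was delivered.
  pump(): boolean {
    if (!this.handlers || this.paused) return false;
    const chunk = this.chunks.shift();
    if (chunk === undefined) return false;
    this.handlers.onData(chunk);
    return true;
  }

  // While paused, no data reaches onData.
  pause(): void {
    this.paused = true;
  }
  resume(): void {
    this.paused = false;
  }

  // Releases everything; safe to call multiple times.
  stop(): void {
    this.handlers = null;
    this.chunks = [];
  }
}
```

restart() is omitted here, in line with the note below that header-less sources do not need it.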
Methods
pause()?
Pause audio capture (optional). When paused, no data should be delivered via onData.
Returns
void
restart()?
Reinitialize the audio encoder without releasing the underlying capture device (optional).
Called during reconnection so the new server session receives a fresh audio stream with proper container headers. Implementations that produce a header-less format (e.g. raw PCM) can omit this.
Returns
void
resume()?
Resume audio capture after pause (optional).
Returns
void
start()
Start capturing audio.
Parameters
| Parameter | Type | Description |
|---|---|---|
| handlers | AudioSourceHandlers | Callbacks for audio data and errors |
Returns
Promise<void>
Throws
AudioPermissionError if microphone access is denied
Throws
AudioDeviceError if no audio device is found
Throws
AudioUnavailableError if audio capture is not supported
stop()
Stop capturing audio and release all resources. Safe to call multiple times.
Returns
void
ClientTtsFactory()
Callable TTS factory with .multiStream() for multi-stream connections.
Parameters
| Parameter | Type |
|---|---|
| input? | TtsStreamInput |
Returns
Promise<RealtimeTtsStream>
Methods
multiStream()
Returns
Promise<RealtimeTtsConnection>
HttpErrorDetails
Error details for SonioxHttpError
Properties
| Property | Type | Description |
|---|---|---|
| bodyText? | string | Response body text (capped at 4KB) |
| cause? | unknown | - |
| code | HttpErrorCode | - |
| headers? | Record<string, string> | - |
| message | string | - |
| method | HttpMethod | - |
| statusCode? | number | - |
| url | string | - |
PermissionResolver
Platform-agnostic permission resolver.
Implementations handle platform-specific permission APIs:
- Browser: navigator.permissions.query + getUserMedia
- React Native: expo-av or react-native-permissions
Example
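A sketch of an in-memory resolver, useful where no real permission API exists (for example in unit tests). The interfaces are mirrored locally; the status values other than 'granted' and the FakePermissionResolver class are assumptions, not part of the SDK.

```typescript
// Assumed status values; only 'granted' is named in this reference.
type PermissionStatus = "granted" | "denied" | "prompt";
interface PermissionResult {
  status: PermissionStatus;
  can_request: boolean;
}
interface PermissionResolver {
  check(permission: "microphone"): Promise<PermissionResult>;
  request(permission: "microphone"): Promise<PermissionResult>;
}

// Hypothetical resolver; userAnswer simulates the user's reply to the prompt.
class FakePermissionResolver implements PermissionResolver {
  private status: PermissionStatus = "prompt";
  private userAnswer: "granted" | "denied";

  constructor(userAnswer: "granted" | "denied") {
    this.userAnswer = userAnswer;
  }

  async check(_permission: "microphone"): Promise<PermissionResult> {
    // Never prompts; only reports the stored status.
    return { status: this.status, can_request: this.status === "prompt" };
  }

  async request(permission: "microphone"): Promise<PermissionResult> {
    // No-op when a decision already exists; otherwise "show" the prompt once.
    if (this.status === "prompt") this.status = this.userAnswer;
    return this.check(permission);
  }
}
```

A typical flow calls check() first and falls back to request() only when the result still allows prompting.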
Methods
check()
Check current permission status WITHOUT prompting the user.
Parameters
| Parameter | Type |
|---|---|
| permission | "microphone" |
Returns
Promise<PermissionResult>
request()
Request permission from the user (may show a system prompt). On platforms where status is already 'granted', this is a no-op.
Parameters
| Parameter | Type |
|---|---|
| permission | "microphone" |
Returns
Promise<PermissionResult>
Function: resolveApiKey()
Resolves an ApiKeyConfig to a plain API key string.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | ApiKeyConfig | The API key configuration |
Returns
Promise<string>
The resolved API key string
Throws
If the function rejects or returns a non-string value
Deprecated
Use SonioxConnectionConfig with SonioxClientOptions.config instead.