Soniox React SDK — Types Reference
SonioxProviderProps
Props for SonioxProvider.
Supply either a pre-built client instance or configuration props.
Type Declaration
| Name | Type |
|---|---|
| children | ReactNode |
UnsupportedReason
Reason why the built-in browser MicrophoneSource is unavailable:
- 'ssr' — navigator is undefined (SSR, React Native, or other non-browser JS runtimes).
- 'no-mediadevices' — navigator exists but navigator.mediaDevices is missing.
- 'no-getusermedia' — navigator.mediaDevices exists but getUserMedia is not a function.
- 'insecure-context' — the page is not served over HTTPS.
This only reflects whether the default MicrophoneSource can work.
Custom AudioSource implementations (e.g. for React Native) bypass this
check entirely and can record regardless of this value.
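The four reasons correspond to a chain of environment checks. A minimal sketch of that chain, assuming the checks run in the order listed (this is not the SDK's source, just the implied logic):

```typescript
// Illustrative sketch of the UnsupportedReason checks, in the order listed
// above. Not the SDK's actual implementation.
type UnsupportedReason = 'ssr' | 'no-mediadevices' | 'no-getusermedia' | 'insecure-context';

interface AudioSupportResult {
  isSupported: boolean;
  reason?: UnsupportedReason;
}

// `env` stands in for the global scope so the chain is easy to exercise.
function classify(env: {
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
  isSecureContext?: boolean;
}): AudioSupportResult {
  if (!env.navigator) return { isSupported: false, reason: 'ssr' };
  if (!env.navigator.mediaDevices) return { isSupported: false, reason: 'no-mediadevices' };
  if (typeof env.navigator.mediaDevices.getUserMedia !== 'function') {
    return { isSupported: false, reason: 'no-getusermedia' };
  }
  if (!env.isSecureContext) return { isSupported: false, reason: 'insecure-context' };
  return { isSupported: true };
}
```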
AudioLevelProps
Extends
Properties
| Property | Type | Description |
|---|---|---|
active? | boolean | Whether volume metering is active. When false, resources are released. |
bands? | number | Number of frequency bands to return. When set, the bands array is populated with per-band levels (0-1). Useful for spectrum/equalizer visualizations. |
children | (state) => ReactNode | - |
fftSize? | number | FFT size for the AnalyserNode. Must be a power of 2. Higher values give more frequency resolution (more bins per band) but update less frequently. Default 256 |
smoothing? | number | Exponential smoothing factor (0-1). Higher = smoother/slower decay. Default 0.85 |
AudioSupportResult
Properties
| Property | Type |
|---|---|
isSupported | boolean |
reason? | UnsupportedReason |
MicrophonePermissionState
Properties
| Property | Type | Description |
|---|---|---|
canRequest | boolean | Whether the permission can be requested (e.g., via a prompt). |
check | () => Promise<void> | Check (or re-check) the microphone permission. No-op when unsupported. |
isDenied | boolean | status === 'denied'. |
isGranted | boolean | status === 'granted'. |
isSupported | boolean | Whether permission checking is available. |
status | MicPermissionStatus | Current permission status. |
RecordingSnapshot
Immutable snapshot of the recording state exposed to React.
Extended by
Properties
| Property | Type | Description |
|---|---|---|
error | Error | null | Latest error, if any. |
finalText | string | Accumulated finalized text. |
finalTokens | readonly RealtimeToken[] | All finalized tokens in chronological order. Useful for rendering per-token metadata (language, speaker, etc.) in the order tokens were spoken. Pair with partialTokens for the complete ordered stream. |
groups | Readonly<Record<string, TokenGroup>> | Tokens grouped by the active groupBy strategy. Auto-populated when translation config is provided: - one_way → keys: "original", "translation" - two_way → keys: language codes (e.g. "en", "es") Empty {} when no grouping is active. |
isActive | boolean | true when state is not idle/stopped/canceled/error. |
isPaused | boolean | true when state === 'paused'. |
isRecording | boolean | true when state === 'recording'. |
isSourceMuted | boolean | true when the audio source is muted externally (e.g. OS-level or hardware mute). |
partialText | string | Text from current non-final tokens. |
partialTokens | readonly RealtimeToken[] | Non-final tokens from the latest result. |
result | RealtimeResult | null | Latest raw result from the server. |
segments | readonly RealtimeSegment[] | Accumulated final segments. |
state | RecordingState | Current recording lifecycle state. |
text | string | Full transcript: finalText + partialText. |
tokens | readonly RealtimeToken[] | Tokens from the latest result message. |
utterances | readonly RealtimeUtterance[] | Accumulated utterances (one per endpoint). |
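The boolean convenience fields and text are pure derivations of state, finalText, and partialText. A sketch of those relationships (illustrative, not the SDK's internal code; the intermediate states 'starting' and 'stopping' are assumptions, only the states named in the table above are documented):

```typescript
// Illustrative derivation of the convenience fields on RecordingSnapshot.
type RecordingState =
  | 'idle' | 'starting' | 'recording' | 'paused'
  | 'stopping' | 'stopped' | 'canceled' | 'error';

function deriveSnapshotFields(s: {
  state: RecordingState;
  finalText: string;
  partialText: string;
}) {
  // isActive is documented as "not idle/stopped/canceled/error".
  const inactive: RecordingState[] = ['idle', 'stopped', 'canceled', 'error'];
  return {
    isRecording: s.state === 'recording',
    isPaused: s.state === 'paused',
    isActive: !inactive.includes(s.state),
    text: s.finalText + s.partialText, // full transcript
  };
}
```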
UseAudioLevelOptions
Extended by
Properties
| Property | Type | Description |
|---|---|---|
active? | boolean | Whether volume metering is active. When false, resources are released. |
bands? | number | Number of frequency bands to return. When set, the bands array is populated with per-band levels (0-1). Useful for spectrum/equalizer visualizations. |
fftSize? | number | FFT size for the AnalyserNode. Must be a power of 2. Higher values give more frequency resolution (more bins per band) but update less frequently. Default 256 |
smoothing? | number | Exponential smoothing factor (0-1). Higher = smoother/slower decay. Default 0.85 |
UseAudioLevelReturn
Properties
| Property | Type | Description |
|---|---|---|
bands | readonly number[] | Per-band frequency levels, each 0-1. Empty array when the bands option is not set. |
volume | number | Current volume level, 0 to 1. Updated every animation frame. |
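As an example of consuming bands, a small helper that renders the per-band 0-1 levels as a fixed-height ASCII spectrum (the helper is illustrative and not part of the SDK):

```typescript
// Render `bands` (each 0-1) as rows of an ASCII spectrum, top row first.
function asciiSpectrum(bands: readonly number[], height = 5): string[] {
  const rows: string[] = [];
  for (let row = height; row >= 1; row--) {
    // A band fills this row when its scaled level reaches the row's height.
    rows.push(bands.map((level) => (level * height >= row ? '█' : ' ')).join(''));
  }
  return rows;
}
```

In a component, the same idea maps each band to a bar element's height instead of a character.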
UseMicrophonePermissionOptions
Properties
| Property | Type | Description |
|---|---|---|
autoCheck? | boolean | Automatically check permission on mount. |
UseRecordingConfig
Configuration for useRecording.
Extends the STT session config (model, language_hints, etc.) with recording-specific and React-specific options.
Can be used with or without a <SonioxProvider>:
- With Provider: omit apiKey — the client is read from context.
- Without Provider: pass apiKey directly — a client is created internally.
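For instance, a standalone (provider-less) config might look like the sketch below. The field names come from the Properties table, but the key and model id are placeholders, not documented defaults:

```typescript
// Hypothetical standalone UseRecordingConfig value.
const config = {
  apiKey: 'YOUR_API_KEY', // or an async function returning a temporary key
  model: 'stt-rt-preview', // placeholder model id
  language_hints: ['en', 'es'],
  enable_speaker_diarization: true,
  groupBy: 'speaker',
  resetOnStart: true,
  onError: (error: Error) => console.error('recording error:', error),
};
```

With a SonioxProvider above the component, apiKey would simply be omitted.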
Extends
SttSessionConfig
Properties
| Property | Type | Description |
|---|---|---|
apiKey? | ApiKeyConfig | API key — string or async function that fetches a temporary key. Required when not using <SonioxProvider>. |
audio_format? | "auto" | AudioFormat | Audio format. Use 'auto' for automatic detection of container formats. For raw PCM formats, also set sample_rate and num_channels. Default 'auto' |
buffer_queue_size? | number | Maximum audio chunks to buffer during connection setup. |
client_reference_id? | string | Optional tracking identifier (max 256 chars). |
context? | TranscriptionContext | Additional context to improve transcription accuracy. |
enable_endpoint_detection? | boolean | Enable endpoint detection for utterance boundaries. Useful for voice AI agents. |
enable_language_identification? | boolean | Enable automatic language detection. |
enable_speaker_diarization? | boolean | Enable speaker identification. |
groupBy? | "translation" | "language" | "speaker" | (token) => string | Group tokens by a key for easy splitting (e.g. translation, language, speaker). - 'translation' — group by translation_status: keys "original" and "translation" - 'language' — group by token language field: keys are language codes - 'speaker' — group by token speaker field: keys are speaker identifiers - (token) => string — custom grouping function Auto-defaults when translation config is provided: - one_way → 'translation' - two_way → 'language' |
language_hints? | string[] | Expected languages in the audio (ISO language codes). |
language_hints_strict? | boolean | When true, recognition is strongly biased toward language hints. Best-effort only, not a hard guarantee. |
model | string | Speech-to-text model to use. |
num_channels? | number | Number of audio channels (required for raw audio formats). |
onConnected? | () => void | Called when the WebSocket connects. |
onEndpoint? | () => void | Called when an endpoint is detected. |
onError? | (error) => void | Called when an error occurs. |
onFinished? | () => void | Called when the recording session finishes. |
onResult? | (result) => void | Called on each result from the server. |
onSourceMuted? | () => void | Called when the audio source is muted externally (e.g. OS-level or hardware mute). |
onSourceUnmuted? | () => void | Called when the audio source is unmuted after an external mute. |
onStateChange? | (update) => void | Called on each state transition. |
permissions? | PermissionResolver | null | Permission resolver override (only used when apiKey is provided). Pass null to explicitly disable. |
resetOnStart? | boolean | Reset transcript state when start() is called. Default true |
sample_rate? | number | Sample rate in Hz (required for PCM formats). |
session_options? | SttSessionOptions | SDK-level session options (signal, etc.). |
source? | AudioSource | Custom audio source (bypasses default MicrophoneSource). |
translation? | TranslationConfig | Translation configuration. |
wsBaseUrl? | string | WebSocket URL override (only used when apiKey is provided). |
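The groupBy strategies in the table above amount to a key function plus a reducer. The sketch below reimplements them for illustration (token shape simplified, and the 'unknown' fallback keys are assumptions, not documented behavior):

```typescript
// Simplified token shape; the real RealtimeToken carries more fields.
interface Token {
  text: string;
  translation_status?: 'original' | 'translation';
  language?: string;
  speaker?: string;
}
type GroupBy = 'translation' | 'language' | 'speaker' | ((token: Token) => string);

function groupTokens(tokens: Token[], groupBy: GroupBy): Record<string, Token[]> {
  const keyOf = (t: Token): string => {
    if (typeof groupBy === 'function') return groupBy(t);
    if (groupBy === 'translation') return t.translation_status ?? 'original';
    if (groupBy === 'language') return t.language ?? 'unknown';
    return t.speaker ?? 'unknown';
  };
  const groups: Record<string, Token[]> = {};
  for (const t of tokens) (groups[keyOf(t)] ??= []).push(t);
  return groups;
}
```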
UseRecordingReturn
The RecordingSnapshot fields plus the control methods, as returned by useRecording.
Extends
Properties
| Property | Type | Description |
|---|---|---|
cancel | () => void | Immediately cancel — does not wait for final results. |
clearTranscript | () => void | Clear transcript state (finalText, partialText, utterances, segments). |
error | Error | null | Latest error, if any. |
finalize | (options?) => void | Request the server to finalize current non-final tokens. |
finalText | string | Accumulated finalized text. |
finalTokens | readonly RealtimeToken[] | All finalized tokens in chronological order. Useful for rendering per-token metadata (language, speaker, etc.) in the order tokens were spoken. Pair with partialTokens for the complete ordered stream. |
groups | Readonly<Record<string, TokenGroup>> | Tokens grouped by the active groupBy strategy. Auto-populated when translation config is provided: - one_way → keys: "original", "translation" - two_way → keys: language codes (e.g. "en", "es") Empty {} when no grouping is active. |
isActive | boolean | true when state is not idle/stopped/canceled/error. |
isPaused | boolean | true when state === 'paused'. |
isRecording | boolean | true when state === 'recording'. |
isSourceMuted | boolean | true when the audio source is muted externally (e.g. OS-level or hardware mute). |
isSupported | boolean | Whether the built-in browser MicrophoneSource is available. Custom AudioSource implementations work regardless of this value. |
partialText | string | Text from current non-final tokens. |
partialTokens | readonly RealtimeToken[] | Non-final tokens from the latest result. |
pause | () => void | Pause recording — pauses audio capture and activates keepalive. |
result | RealtimeResult | null | Latest raw result from the server. |
resume | () => void | Resume recording after pause. |
segments | readonly RealtimeSegment[] | Accumulated final segments. |
start | () => void | Start a new recording. Aborts any in-flight recording first. |
state | RecordingState | Current recording lifecycle state. |
stop | () => Promise<void> | Gracefully stop — waits for final results from the server. |
text | string | Full transcript: finalText + partialText. |
tokens | readonly RealtimeToken[] | Tokens from the latest result message. |
unsupportedReason | UnsupportedReason | undefined | Why the built-in MicrophoneSource is unavailable, if applicable. Custom AudioSource implementations bypass this check entirely. |
utterances | readonly RealtimeUtterance[] | Accumulated utterances (one per endpoint). |
AudioLevel()
Parameters
| Parameter | Type |
|---|---|
__namedParameters | AudioLevelProps |
Returns
ReactNode
SonioxProvider()
Parameters
| Parameter | Type |
|---|---|
props | SonioxProviderProps |
Returns
ReactNode
checkAudioSupport()
Check whether the current environment supports the built-in browser
MicrophoneSource (which uses navigator.mediaDevices.getUserMedia).
This does not reflect general recording capability — custom AudioSource
implementations (e.g. for React Native) bypass this check entirely and can
record regardless of the result.
Returns
Platform
browser
useAudioLevel()
Parameters
| Parameter | Type |
|---|---|
options? | UseAudioLevelOptions |
Returns
useMicrophonePermission()
Parameters
| Parameter | Type |
|---|---|
options? | UseMicrophonePermissionOptions |
Returns
useRecording()
Parameters
| Parameter | Type |
|---|---|
config | UseRecordingConfig |
Returns
useSoniox()
Returns the SonioxClient instance provided by the nearest SonioxProvider.
Returns
SonioxClient
Throws
Error if called outside a SonioxProvider.