REST speech generation with Web SDK
Generate speech from text in the browser with the Soniox Web SDK over HTTP
The Soniox Web SDK supports Text-to-Speech generation over HTTP with SonioxClient. Use REST when you have the full text up front — the SDK returns audio bytes that you can play, download, or hand to an <audio> element or MediaSource.
Use real-time speech generation when you want to narrate text as it arrives from an LLM or need the lowest latency to first audio.
Set up your temporary API key endpoint
In a browser environment you don't want to expose your primary API key. Create a temporary key endpoint on your server using the Soniox Node SDK and request a key with usage_type: 'tts_rt'.
Read more about Temporary API keys.
Quickstart
Create a SonioxClient with a config resolver that fetches a fresh temporary key, then call client.tts.generate() and play the resulting audio.
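The generate-and-play step can be sketched as follows. This is a minimal sketch, not the SDK's own sample: it assumes an already constructed client (construction is left out because the constructor options are not shown on this page), and `TtsLike` and `generateAudioUrl` are illustrative names rather than SDK exports. The MIME type assumes the default `wav` output format.

```typescript
// Structural stand-in for client.tts; only the method used here is declared.
interface TtsLike {
  generate(options: { text: string; voice: string }): Promise<Uint8Array>;
}

// Generate speech and return an object URL that an <audio> element can play.
// Assumes the default "wav" output format.
async function generateAudioUrl(
  tts: TtsLike,
  text: string,
  voice: string,
): Promise<string> {
  const bytes = await tts.generate({ text, voice });
  // slice() copies the bytes into a fresh array before wrapping them in a Blob.
  const blob = new Blob([bytes.slice()], { type: "audio/wav" });
  return URL.createObjectURL(blob);
}
```

In the browser, the returned URL can be passed to `new Audio(url)` or set as the `src` of an existing `<audio>` element. Browsers may block `play()` unless it is triggered by a user gesture, and the URL should be released with `URL.revokeObjectURL(url)` once playback is finished.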
Generate to bytes
client.tts.generate() returns a Promise<Uint8Array> with the full audio payload. Use this when you want to upload the audio, store it locally, or decode it with the Web Audio API.
Stream audio chunks
client.tts.generateStream() returns an AsyncIterable<Uint8Array> so you can start playback before the full payload has arrived. Pair it with MediaSource for progressive playback.
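One way to consume the stream is to forward each chunk to a sink as it arrives. The sketch below relies only on the `AsyncIterable<Uint8Array>` shape described above; `pumpChunks` and the sink callback are hypothetical helpers, not part of the SDK.

```typescript
// Forward each chunk to a sink as it arrives and return the total byte count.
// The sink may be async, for example a function that appends to a SourceBuffer
// and resolves on its "updateend" event.
async function pumpChunks(
  stream: AsyncIterable<Uint8Array>,
  sink: (chunk: Uint8Array) => void | Promise<void>,
): Promise<number> {
  let total = 0;
  for await (const chunk of stream) {
    // The next chunk is not requested until the sink has settled.
    await sink(chunk);
    total += chunk.byteLength;
  }
  return total;
}
```

With the SDK this would be called as `pumpChunks(client.tts.generateStream(options), appendChunk)`, where `appendChunk` is your own function. MediaSource only accepts container and codec combinations that `MediaSource.isTypeSupported()` reports as supported, so check the chosen `audio_format` before relying on progressive playback.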
For simpler use cases, you can concatenate chunks and play a single Blob:
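A minimal sketch of the concatenation step, assuming only that the stream yields `Uint8Array` chunks; `collectChunks` is an illustrative helper name, not an SDK function.

```typescript
// Drain the stream and join all chunks into one contiguous byte array.
async function collectChunks(
  stream: AsyncIterable<Uint8Array>,
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of stream) {
    chunks.push(chunk);
    total += chunk.byteLength;
  }
  // Allocate once, then copy each chunk to its offset.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}
```

The result can then be wrapped in a Blob whose MIME type matches the requested `audio_format` and played through an object URL. This gives up early playback in exchange for simpler code.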
Generation options
Both generate() and generateStream() accept the same GenerateSpeechOptions shape:
| Option | Type | Description |
|---|---|---|
| text | string | Input text to synthesize. Required. |
| voice | string | Voice identifier (e.g. "Adrian"). Required. |
| model | string | TTS model. Default "tts-rt-v1-preview". |
| language | string | Language code. Default "en". |
| audio_format | TtsAudioFormat | Output audio format. Default "wav". |
| sample_rate | number | Output sample rate in Hz. Required for raw PCM formats. |
| bitrate | number | Codec bitrate in bps (for compressed formats). |
| signal | AbortSignal | Optional signal to cancel the request. |
See Available models for the full list of TTS models, voices, and supported audio formats.
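As an illustration of the option shape, the sketch below mirrors the table in a local interface and checks the two required fields before a call. `SpeechOptionsSketch` and `missingRequired` are hypothetical names; the SDK's own `GenerateSpeechOptions` type is authoritative, and `audio_format` is typed as a plain string here only because the members of `TtsAudioFormat` are not listed on this page.

```typescript
// Local mirror of the option table; not the SDK's exported type.
interface SpeechOptionsSketch {
  text: string;
  voice: string;
  model?: string;
  language?: string;
  audio_format?: string; // TtsAudioFormat in the SDK
  sample_rate?: number;
  bitrate?: number;
  signal?: AbortSignal;
}

// Return the names of required fields that are missing or empty.
function missingRequired(options: Partial<SpeechOptionsSketch>): string[] {
  const required: Array<"text" | "voice"> = ["text", "voice"];
  return required.filter((key) => !options[key]);
}

// Example options relying on the documented defaults for model and bitrate.
const exampleOptions: SpeechOptionsSketch = {
  text: "Hello from the browser.",
  voice: "Adrian",
  language: "en",
  audio_format: "wav",
};
```

A check like this is useful when the text or voice comes from user input, so that an incomplete form is caught before a request is sent.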
Cancel a request
Pass an AbortSignal to cancel a generation request — useful when the user clicks "stop" mid-playback.
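A minimal sketch of wiring a "stop" button to a request, using the standard `AbortController`. `CancellableTts` and `startGeneration` are illustrative names; only the `signal` option comes from the SDK's documented options.

```typescript
// Structural stand-in for client.tts, including the documented signal option.
interface CancellableTts {
  generate(options: {
    text: string;
    voice: string;
    signal?: AbortSignal;
  }): Promise<Uint8Array>;
}

// Start a generation and return a cancel() handle alongside the pending audio.
function startGeneration(tts: CancellableTts, text: string, voice: string) {
  const controller = new AbortController();
  const audio = tts.generate({ text, voice, signal: controller.signal });
  return { audio, cancel: () => controller.abort() };
}
```

Call `cancel()` from the button's click handler. The pending `audio` promise then rejects, so await it inside `try`/`catch`; per the error table on this page, an aborted request surfaces as a `SonioxHttpError` with `code: 'aborted'`.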
Error handling
REST TTS requests can throw the following errors:
| Error | When it's thrown |
|---|---|
| SonioxHttpError | Covers all HTTP failures: non-2xx responses (code: 'http_error'), network failures (code: 'network_error'), timeouts (code: 'timeout'), aborted requests (code: 'aborted'), and parse errors (code: 'parse_error'). Inspect code, statusCode, message, and bodyText. |
| SonioxError | Base class for all SDK errors (SonioxHttpError extends it). Catch this if you want a single branch for every Soniox-originated failure. |
For raw HTTP integration details, see the TTS REST API reference.
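The error codes listed above can drive a small decision function in a `catch` block. The sketch below duck-types the documented fields so it stands alone; in application code an `instanceof SonioxHttpError` check would be the more direct test. The retry policy shown (retry on network errors, timeouts, and 5xx responses) is an assumption for illustration, not SDK behavior.

```typescript
// Duck-typed view of the SonioxHttpError fields described in the table.
interface HttpErrorLike {
  code: string;
  statusCode?: number;
  message: string;
  bodyText?: string;
}

function isHttpErrorLike(err: unknown): err is HttpErrorLike {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { code?: unknown }).code === "string" &&
    typeof (err as { message?: unknown }).message === "string"
  );
}

// Decide what the caller should do with a failed TTS request.
function classifyTtsError(err: unknown): "ignore" | "retry" | "report" {
  if (!isHttpErrorLike(err)) return "report";
  switch (err.code) {
    case "aborted":
      return "ignore"; // the user cancelled the request
    case "network_error":
    case "timeout":
      return "retry";
    case "http_error":
      // Assumed policy: server-side failures are retryable, client errors are not.
      return err.statusCode !== undefined && err.statusCode >= 500
        ? "retry"
        : "report";
    default:
      return "report"; // parse_error and any unrecognized code
  }
}
```

A caller would wrap `client.tts.generate(...)` in `try`/`catch` and pass the caught value to `classifyTtsError`.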
Error handling limitations
Mid-stream errors reported via HTTP trailers (X-Tts-Error-Code, X-Tts-Error-Message) are not exposed by the browser fetch API, so the Soniox Web SDK cannot surface them. For guaranteed error delivery, use real-time WebSocket TTS instead.
Server-driven defaults
TTS defaults travel with your temporary API key. Return a tts_defaults object next to api_key from your key endpoint and the Web SDK will merge it as the base layer for every REST (and WebSocket) TTS call. Caller-provided fields on client.tts.generate(...) / generateStream(...) override the defaults.
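The precedence rule can be illustrated with a plain object merge. This sketch only demonstrates the layering; the SDK performs the merge itself, and which fields `tts_defaults` may carry is an assumption here. `KeyResponse`, `TtsFields`, and `resolveOptions` are hypothetical names.

```typescript
// Assumed subset of fields that a server might set as defaults.
type TtsFields = {
  voice?: string;
  model?: string;
  language?: string;
  audio_format?: string;
};

// Hypothetical shape of the key endpoint's JSON response.
interface KeyResponse {
  api_key: string;
  tts_defaults?: TtsFields;
}

// Server defaults form the base layer; caller-provided fields win on conflict.
function resolveOptions(
  defaults: TtsFields | undefined,
  caller: TtsFields & { text: string },
) {
  return { ...defaults, ...caller };
}

// Example: the server pins the voice and language, the caller overrides the language.
const response: KeyResponse = {
  api_key: "temp-key-placeholder",
  tts_defaults: { voice: "Adrian", language: "en" },
};
const resolved = resolveOptions(response.tts_defaults, {
  text: "Hola",
  language: "es",
});
```

Here `resolved` keeps the server-provided `voice` while the caller's `language` replaces the default, which matches the override behavior described above.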
See also
- Real-time speech generation — WebSocket-based streaming TTS.
- GenerateSpeechOptions reference
- SonioxClient reference