
REST speech generation with Node SDK

Generate speech from text with the Soniox Node SDK over HTTP

The Soniox Node SDK supports Text-to-Speech generation over HTTP through SonioxNodeClient. Use REST when the full text is available up front rather than arriving incrementally. The SDK can return the audio as bytes, stream it as an async iterable, or write it directly to a file.

Use real-time speech generation when you want the lowest latency to first audio or when text arrives incrementally (for example, streamed from an LLM).

Quickstart

The shortest path is generateToFile — the SDK calls the REST API, streams the response, and writes the output for you.

import { SonioxNodeClient } from "@soniox/node";

// The API key is read from the SONIOX_API_KEY environment variable.
const client = new SonioxNodeClient();

const bytesWritten = await client.tts.generateToFile("hello.wav", {
  text: "Hello from the Soniox Node SDK text-to-speech example.",
  voice: "Adrian",
  model: "tts-rt-v1-preview",
  language: "en",
  audio_format: "wav",
});

console.log(`Wrote ${bytesWritten} bytes`);

Generate to bytes

Use client.tts.generate() when you want the full audio payload in memory — for custom storage, uploading to another service, or post-processing.

const audio = await client.tts.generate({
  text: "This response is generated in memory.",
  voice: "Adrian",
  model: "tts-rt-v1-preview",
  language: "en",
  audio_format: "wav",
});

console.log(`Received ${audio.byteLength} bytes`); // audio is a Uint8Array
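
Once the audio is in memory, post-processing can start with a quick sanity check of the payload. The sketch below reads the format fields from a WAV header. It assumes the payload begins with the canonical 44-byte RIFF/WAVE header (a "fmt " chunk directly after the RIFF descriptor), which this page does not guarantee; readWavHeader and WavInfo are hypothetical names, not part of the SDK.

```typescript
// Hypothetical helper, not part of the Soniox SDK.
// Assumes a canonical 44-byte RIFF/WAVE header at the start of the payload.
interface WavInfo {
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

function readWavHeader(audio: Uint8Array): WavInfo {
  if (audio.byteLength < 44) {
    throw new Error("Payload is too short to contain a canonical WAV header");
  }
  // Respect byteOffset in case the Uint8Array is a view into a larger buffer.
  const view = new DataView(audio.buffer, audio.byteOffset, audio.byteLength);
  const tag = (offset: number): string =>
    String.fromCharCode(
      audio[offset],
      audio[offset + 1],
      audio[offset + 2],
      audio[offset + 3],
    );

  if (tag(0) !== "RIFF" || tag(8) !== "WAVE" || tag(12) !== "fmt ") {
    throw new Error("Payload does not start with a canonical RIFF/WAVE header");
  }

  return {
    channels: view.getUint16(22, true), // all WAV header fields are little-endian
    sampleRate: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
  };
}
```

For example, readWavHeader(audio).sampleRate gives the rate to hand to a downstream player or resampler.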

Stream audio chunks

Use client.tts.generateStream() to receive the response as an async iterable of Uint8Array chunks. This lets you start processing or playing audio before the full payload has arrived.

import { createWriteStream } from "node:fs";

const output = createWriteStream("streamed.wav");

let totalBytes = 0;
for await (const chunk of client.tts.generateStream({
  text: "Streaming audio as it arrives from the server.",
  voice: "Adrian",
  model: "tts-rt-v1-preview",
  language: "en",
  audio_format: "wav",
})) {
  output.write(chunk);
  totalBytes += chunk.byteLength;
}

output.end();
console.log(`Received ${totalBytes} bytes`);
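
The loop above ignores the return value of output.write(), so a slow disk or socket lets chunks pile up in the stream's internal buffer. A minimal sketch that waits for Node's "drain" event whenever the buffer is full follows. writeChunks is a hypothetical helper, not part of the SDK; it accepts any async iterable of Uint8Array, such as the one client.tts.generateStream() is described as returning.

```typescript
import { once } from "node:events";
import { Writable } from "node:stream";

// Hypothetical helper, not part of the Soniox SDK.
// Writes every chunk to `sink`, pausing while the sink's buffer is full.
async function writeChunks(
  chunks: AsyncIterable<Uint8Array>,
  sink: Writable,
): Promise<number> {
  let totalBytes = 0;
  for await (const chunk of chunks) {
    // write() returns false once the internal buffer exceeds highWaterMark.
    if (!sink.write(chunk)) {
      await once(sink, "drain");
    }
    totalBytes += chunk.byteLength;
  }
  return totalBytes;
}

// Usage (the caller still ends the stream):
//   const output = createWriteStream("streamed.wav");
//   const total = await writeChunks(client.tts.generateStream({ ... }), output);
//   output.end();
```

The helper leaves closing the sink to the caller so the same stream can receive further data if needed.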

Write directly to a file

client.tts.generateToFile() accepts either a file path (string) or any WritableStream and returns the total number of bytes written. This is the simplest option for common server-side workflows.

const bytesWritten = await client.tts.generateToFile("hello.pcm", {
  text: "Hello from Soniox.",
  voice: "Adrian",
  model: "tts-rt-v1-preview",
  language: "en",
  audio_format: "pcm_s16le",
  sample_rate: 24000,
});

console.log(`Wrote ${bytesWritten} bytes`);
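
Raw PCM carries no header, so the playback duration has to be derived from the byte count. For pcm_s16le each sample occupies 2 bytes, which gives duration = bytes / (sample_rate * channels * 2). The sketch below assumes mono output unless told otherwise, which this page does not state; pcmDurationSeconds is a hypothetical helper.

```typescript
// Hypothetical helper, not part of the Soniox SDK.
// Duration of raw pcm_s16le audio; `channels` defaults to 1 (an assumption).
function pcmDurationSeconds(
  byteLength: number,
  sampleRate: number,
  channels: number = 1,
): number {
  const bytesPerSample = 2; // s16le = signed 16-bit little-endian
  return byteLength / (sampleRate * channels * bytesPerSample);
}

// 48000 bytes at 24000 Hz mono: 48000 / (24000 * 1 * 2) = 1 second
console.log(pcmDurationSeconds(48000, 24000)); // prints 1
```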

List available models

client.tts.listModels() returns the set of available TTS models with their supported voices.

const models = await client.tts.listModels();

for (const model of models) {
  const voices = model.voices.map((v) => v.id).join(", ");
  console.log(`${model.id} (${model.name}): ${voices}`);
}
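
A common follow-up is checking which models offer a particular voice before sending a request. The sketch below works on the fields used in the loop above (id, name, voices[].id); the ModelLike and VoiceLike interfaces are local stand-ins rather than the SDK's exported types, and modelsSupportingVoice is a hypothetical helper.

```typescript
// Local stand-ins for the shape used above; not the SDK's own types.
interface VoiceLike {
  id: string;
}
interface ModelLike {
  id: string;
  name: string;
  voices: VoiceLike[];
}

// Hypothetical helper: ids of all models that list the given voice.
function modelsSupportingVoice(models: ModelLike[], voiceId: string): string[] {
  return models
    .filter((model) => model.voices.some((voice) => voice.id === voiceId))
    .map((model) => model.id);
}

// Usage:
//   const models = await client.tts.listModels();
//   const candidates = modelsSupportingVoice(models, "Adrian");
```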

Generation options

All three generation methods (generate, generateStream, and generateToFile) accept the same GenerateSpeechOptions shape:

| Option | Type | Description |
| --- | --- | --- |
| text | string | Input text to synthesize. Required. |
| voice | string | Voice identifier (e.g. "Adrian"). Required. |
| model | string | TTS model. Default "tts-rt-v1-preview". |
| language | string | Language code. Default "en". |
| audio_format | TtsAudioFormat | Output audio format. Default "wav". |
| sample_rate | number | Output sample rate in Hz. Required for raw PCM formats. |
| bitrate | number | Codec bitrate in bps (for compressed formats). |
| signal | AbortSignal | Optional signal to cancel the request. |

See Available models for the full list of TTS models, voices, and supported audio formats.
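
The rules in the table can be checked locally before a request is sent. This is a minimal sketch under two assumptions: the SpeechOptionsSketch interface is a local mirror of the table, not the SDK's GenerateSpeechOptions type, and raw PCM formats are recognized by a "pcm_" prefix (only "pcm_s16le" appears on this page). findOptionProblems is a hypothetical helper.

```typescript
// Local mirror of the options table; not the SDK's exported type.
interface SpeechOptionsSketch {
  text: string;
  voice: string;
  model?: string;
  language?: string;
  audio_format?: string;
  sample_rate?: number;
  bitrate?: number;
}

// Hypothetical helper: returns a list of human-readable problems (empty if none).
function findOptionProblems(options: SpeechOptionsSketch): string[] {
  const problems: string[] = [];
  if (options.text.trim() === "") problems.push("text is required");
  if (options.voice.trim() === "") problems.push("voice is required");

  const format = options.audio_format ?? "wav"; // documented default
  // Assumption: every raw PCM format name starts with "pcm_".
  if (format.startsWith("pcm_") && options.sample_rate === undefined) {
    problems.push(`sample_rate is required for raw PCM format "${format}"`);
  }
  return problems;
}
```

Returning a list rather than throwing on the first problem lets a caller report everything that needs fixing at once.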

Cancel a request

Pass an AbortSignal to cancel a generation request — useful when the user changes their mind mid-request, or when you need a deadline.

import { SonioxHttpError } from "@soniox/node";

const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 5000); // abort after 5s

try {
  const audio = await client.tts.generate({
    text: "Some long text...",
    voice: "Adrian",
    signal: controller.signal,
  });
  console.log(`Received ${audio.byteLength} bytes`);
} catch (err) {
  if (err instanceof SonioxHttpError && err.code === "aborted") {
    console.log("Generation cancelled");
  } else {
    throw err;
  }
} finally {
  clearTimeout(timer); // don't leave the timer pending once the request settles
}
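
When a request needs both a deadline and user-driven cancellation, the two can be combined into one signal. The sketch below uses only standard AbortController and timer APIs; withDeadline is a hypothetical helper, not part of the SDK. It returns a clear() function so the timer and listener can be released once the request settles.

```typescript
// Hypothetical helper, not part of the Soniox SDK.
// The returned signal aborts when `ms` elapses or when `parent` aborts.
function withDeadline(
  ms: number,
  parent?: AbortSignal,
): { signal: AbortSignal; clear: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  const onParentAbort = (): void => controller.abort();

  if (parent) {
    if (parent.aborted) {
      controller.abort(); // already cancelled before we started
    } else {
      parent.addEventListener("abort", onParentAbort, { once: true });
    }
  }

  const clear = (): void => {
    clearTimeout(timer);
    parent?.removeEventListener("abort", onParentAbort);
  };
  return { signal: controller.signal, clear };
}

// Usage:
//   const { signal, clear } = withDeadline(5000, userController.signal);
//   try {
//     await client.tts.generate({ text: "...", voice: "Adrian", signal });
//   } finally {
//     clear();
//   }
```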

Error handling

REST TTS requests can throw the following errors:

| Error | When it's thrown |
| --- | --- |
| SonioxHttpError | Covers all HTTP failures: non-2xx responses (code: 'http_error'), network failures (code: 'network_error'), timeouts (code: 'timeout'), aborted requests (code: 'aborted'), and parse errors (code: 'parse_error'). Inspect code, statusCode, message, and bodyText. |
| SonioxError | Base class for all SDK errors (SonioxHttpError extends it). Catch this if you want a single branch for every Soniox-originated failure. |

import { SonioxHttpError, SonioxError } from "@soniox/node";

try {
  await client.tts.generateToFile("out.wav", {
    text: "Hello!",
    voice: "Adrian",
  });
} catch (err) {
  if (err instanceof SonioxHttpError) {
    console.error(
      `HTTP ${err.statusCode ?? "n/a"} (${err.code}): ${err.message}`
    );
  } else if (err instanceof SonioxError) {
    console.error("Soniox SDK error:", err.message);
  } else {
    throw err;
  }
}

For raw HTTP integration details, see the TTS REST API reference.
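
The error codes listed above can drive a retry decision. The sketch below is one possible policy, not SDK behavior: it treats network failures and timeouts as transient, and for 'http_error' it retries only on 429 and 5xx, which is a general HTTP convention rather than something this page states about the Soniox API. isRetryable is a hypothetical helper.

```typescript
// Hypothetical policy helper, not part of the Soniox SDK.
// `code` and `statusCode` correspond to the SonioxHttpError fields described above.
function isRetryable(code: string, statusCode?: number): boolean {
  switch (code) {
    case "network_error":
    case "timeout":
      return true; // likely transient
    case "http_error":
      // Assumption: 429 and 5xx are worth retrying; other statuses are not.
      return (
        statusCode === 429 || (statusCode !== undefined && statusCode >= 500)
      );
    default:
      // "aborted" was requested by the caller; "parse_error" will not fix itself.
      return false;
  }
}

// Usage inside a catch block:
//   if (err instanceof SonioxHttpError && isRetryable(err.code, err.statusCode)) {
//     // wait, then try again
//   }
```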

Error handling limitations

Mid-stream errors reported via HTTP trailers (X-Tts-Error-Code, X-Tts-Error-Message) may not be surfaced by HTTP clients that ignore trailers, including browser fetch and the Soniox JS SDK. For guaranteed error delivery, use the real-time WebSocket TTS instead.
