Web SDK

Build speech-to-text and text-to-speech workflows in the browser with real-time APIs.

Soniox Web SDK is the official JavaScript/TypeScript SDK for using the Soniox Real-time API and Text-to-Speech API directly in the browser. It lets you:

  • Capture audio from the user's microphone
  • Stream audio to Soniox in real time
  • Receive transcription and translation results instantly
  • Generate speech from text over HTTP or WebSocket

Quickstart

Install

Install via your preferred package manager:

npm install @soniox/client
yarn add @soniox/client
pnpm add @soniox/client
bun add @soniox/client

Set up your temporary API key endpoint

In a client environment (browser, mobile app, React Native, etc.), your API key must never be exposed to the client. Instead, create an endpoint on your server that issues short-lived temporary API keys, and have the client request a key from that endpoint.

For example, you can use our Node SDK to create a temporary API key endpoint.

import express from 'express';
import { SonioxNodeClient } from '@soniox/node';

const app = express();
const client = new SonioxNodeClient(); // reads SONIOX_API_KEY from env

// Create a temporary API key endpoint
app.post('/tmp-key', async (_req, res) => {
  try {
    const { api_key, expires_at } = await client.auth.createTemporaryKey({
      usage_type: 'transcribe_websocket',
      expires_in_seconds: 300, // 1..3600
    });

    res.json({ api_key, expires_at });
  } catch (err) {
    res.status(500).json({ error: err instanceof Error ? err.message : 'Failed to create temporary key' });
  }
});

app.listen(3000, () => {
  console.log('Server listening on http://localhost:3000');
});

Read more about our Node SDK and Temporary API keys
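
On the client side, the response from the `/tmp-key` endpoint above can be checked before the key is handed to the SDK. The following is a minimal sketch, not part of the SDK: `parseTemporaryKey`, `fetchTemporaryKey`, `FetchLike`, and `ResponseLike` are hypothetical helper names, and the only field assumed in the response body is the `api_key` that the endpoint above returns.

```typescript
// Minimal structural type for the parts of a fetch Response used here.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Hypothetical injection point: pass the global `fetch` in the browser.
type FetchLike = (url: string, init?: { method?: string }) => Promise<ResponseLike>;

// Validate the JSON body returned by the key endpoint.
function parseTemporaryKey(body: unknown): { api_key: string } {
  if (typeof body !== "object" || body === null) {
    throw new Error("Temporary key response is not a JSON object");
  }
  const apiKey = (body as Record<string, unknown>).api_key;
  if (typeof apiKey !== "string" || apiKey.length === 0) {
    throw new Error("Temporary key response has no api_key");
  }
  return { api_key: apiKey };
}

// Request a temporary key and fail loudly on HTTP errors.
async function fetchTemporaryKey(
  fetchImpl: FetchLike,
  url: string,
): Promise<{ api_key: string }> {
  const res = await fetchImpl(url, { method: "POST" });
  if (!res.ok) {
    throw new Error(`Temporary key request failed with status ${res.status}`);
  }
  return parseTemporaryKey(await res.json());
}
```

The helper resolves to the same `{ api_key }` shape that the `config` callback in the next section returns, so a call such as `fetchTemporaryKey(fetch, "/tmp-key")` could serve as the body of that callback.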

Create your first real-time session

import { SonioxClient } from "@soniox/client";

// Create a Soniox client
const client = new SonioxClient({
  // Pass a function that fetches a temporary API key (and optional region / URL overrides)
  // from your server for each new session.
  config: async () => {
    const res = await fetch("/tmp-key", { method: "POST" });
    const { api_key } = await res.json();
    return { api_key };
  },
});

// Create a recording session
const recording = client.realtime.record({ model: "stt-rt-v4" });

// Listen for transcription results
recording.on("result", (result) => {
  const text = result.tokens.map((t) => t.text).join("");
  if (text) console.log(text);
});

// Listen for errors
recording.on("error", (err) => console.error("Error:", err));

// Call this from your UI (e.g. a Stop button) to end gracefully and wait for final results.
async function stopRecording() {
  await recording.stop();
}

Learn more about Real-time transcription
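
The `result` handler above joins the text of every token in a single result. For a live caption display it can help to keep confirmed text separate from provisional text. The sketch below assumes each token carries a boolean `is_final` flag; that field is not shown on this page and should be checked against the real-time API reference. `TranscriptBuffer` is a hypothetical helper, not an SDK class.

```typescript
// Minimal token shape. `text` appears in the example above;
// `is_final` is an assumption, not confirmed by this page.
interface Token {
  text: string;
  is_final?: boolean;
}

class TranscriptBuffer {
  private finalText = "";
  private pendingText = "";

  // Feed the tokens from one "result" event.
  push(tokens: Token[]): void {
    let pending = "";
    for (const token of tokens) {
      if (token.is_final) {
        // Final tokens are appended once and never revised.
        this.finalText += token.text;
      } else {
        pending += token.text;
      }
    }
    // Provisional text is replaced wholesale by each new result.
    this.pendingText = pending;
  }

  // Confirmed text only.
  get committed(): string {
    return this.finalText;
  }

  // Confirmed text followed by the current provisional text.
  get text(): string {
    return this.finalText + this.pendingText;
  }
}
```

Under that assumption, the handler body would call `buffer.push(result.tokens)` and render `buffer.text`, styling the part beyond `buffer.committed` as provisional.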

Generate your first speech

See Real-time speech generation for an example server endpoint that issues a temporary key with usage_type: 'tts_rt'.

import { SonioxClient } from "@soniox/client";

const client = new SonioxClient({
  config: async () => {
    const res = await fetch("/tts-tmp-key");
    const { api_key } = await res.json();
    return { api_key }; // temporary key with usage_type: 'tts_rt'
  },
});

const stream = await client.realtime.tts({ voice: "Adrian", audio_format: "wav" });
stream.sendText("Hello from Soniox Web SDK text-to-speech.", { end: true });

const chunks: Uint8Array[] = [];
for await (const chunk of stream) chunks.push(chunk);

const blob = new Blob(chunks, { type: "audio/wav" });
await new Audio(URL.createObjectURL(blob)).play();
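
The `Blob` constructor accepts the chunk list directly, so no manual joining is needed for the playback shown above. When a single contiguous buffer is required instead (for example, `AudioContext.decodeAudioData` takes one `ArrayBuffer`), the chunks can be concatenated first. A minimal sketch, where `concatChunks` is a hypothetical helper and not part of the SDK:

```typescript
// Join audio chunks into one contiguous byte array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // Sum the sizes first so the output is allocated exactly once.
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);

  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset); // copy this chunk at the current write position
    offset += chunk.byteLength;
  }
  return out;
}
```

Because the output array is freshly allocated, `concatChunks(chunks).buffer` covers exactly the joined bytes.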

Next steps