Concurrency limits

Overview

Soniox applies concurrency limits to real-time APIs to ensure stability and fair use. Each project and organization has a limit on the number of Speech-to-Text WebSocket sessions and Text-to-Speech WebSocket streams it can have open at the same time. Active POST /tts REST requests count toward the same tts_concurrent limit.

To monitor live concurrency, open the project's Usage > Activity page in the Soniox Console. The dashboard shows real-time charts of concurrent requests against the configured limit for both the project and the organization.

The organization limit is checked first, then the project limit. A request is rejected as soon as either limit is at its cap, and the 429 response identifies which tier rejected it. See rate and usage limits for the response shape.
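One common way for a client to react to a 429 is to retry with exponential backoff and jitter, so that a session closing elsewhere has time to free a slot. The sketch below is illustrative only and is not taken from the Soniox SDK. The `attempt` callable, the function name, and the delay values are all assumptions: `attempt` stands in for whatever code opens the session or sends the request, and returns the HTTP status code it received.

```python
import random
import time
from typing import Callable


def open_with_backoff(
    attempt: Callable[[], int],
    max_tries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call attempt() until it returns something other than 429.

    attempt is a hypothetical callable that performs one request and
    returns its HTTP status code. The last status seen is returned,
    which is still 429 if every try was rejected.
    """
    status = 429
    for n in range(max_tries):
        status = attempt()
        if status != 429:
            return status
        if n < max_tries - 1:
            # Exponential backoff capped at max_delay, with full jitter
            # so that many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** n))
            sleep(random.uniform(0, delay))
    return status
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. If retries keep failing, the live counts described in the next section can show which tier is saturated.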

What gets returned

The GET /v1/concurrency-limits response has two top-level scopes, project and organization. Each scope contains:

current - live counts of active requests right now.
limits - the configured cap for that scope, or null when no cap is configured. When a value under project.limits is null, the project has no cap of its own and only the organization limit applies.

Two services are tracked under each scope:

transcribe_concurrent - open Speech-to-Text WebSocket sessions.
tts_concurrent - open Text-to-Speech WebSocket streams and active Text-to-Speech REST requests.

{
  "project": {
    "current": { "transcribe_concurrent": 2, "tts_concurrent": 0 },
    "limits":  { "transcribe_concurrent": 4, "tts_concurrent": 1 }
  },
  "organization": {
    "current": { "transcribe_concurrent": 5, "tts_concurrent": 1 },
    "limits":  { "transcribe_concurrent": 10, "tts_concurrent": 2 }
  }
}

See GET /v1/concurrency-limits for the full schema and field types.
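Because a request must pass both the organization and the project check, the number of sessions a client can still open is bounded by the tighter of the two scopes. The following sketch computes that headroom from a response shaped like the sample above. The helper names `remaining` and `headroom` are hypothetical, and the code assumes a null limit arrives as `None` after JSON decoding.

```python
from typing import Optional


def remaining(scope: dict, service: str) -> Optional[int]:
    """Slots left in one scope, or None when that scope has no cap."""
    limit = scope["limits"].get(service)
    if limit is None:
        return None
    return max(0, limit - scope["current"].get(service, 0))


def headroom(snapshot: dict, service: str) -> Optional[int]:
    """Slots left before either tier rejects; None if neither has a cap."""
    caps = [remaining(snapshot[s], service) for s in ("organization", "project")]
    known = [c for c in caps if c is not None]
    return min(known) if known else None


# The sample response from this page, decoded into Python objects.
snapshot = {
    "project": {
        "current": {"transcribe_concurrent": 2, "tts_concurrent": 0},
        "limits": {"transcribe_concurrent": 4, "tts_concurrent": 1},
    },
    "organization": {
        "current": {"transcribe_concurrent": 5, "tts_concurrent": 1},
        "limits": {"transcribe_concurrent": 10, "tts_concurrent": 2},
    },
}

print(headroom(snapshot, "transcribe_concurrent"))  # 2 (project is tighter)
print(headroom(snapshot, "tts_concurrent"))         # 1
```

When a value under project.limits is null, `remaining` returns `None` for the project and only the organization figure is used, which matches the rule described above.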

You can request higher limits in the Soniox Console.
