Concurrency limits
Live counts of active real-time requests alongside the configured concurrency limits for the project and the organization that owns it.
Overview
Soniox applies concurrency limits to real-time APIs to ensure stability and fair use. Each project and organization
has a limit on the number of Speech-to-Text WebSocket sessions and
Text-to-Speech WebSocket streams it can have open at the same time.
Active POST /tts REST requests count toward the same tts_concurrent limit.
To monitor live concurrency, open the project's Usage > Activity page in the Soniox Console. The dashboard shows real-time charts of concurrent requests against the configured limit for both the project and the organization.
The organization limit is checked first, then the project limit. A request is rejected as soon as either limit is at its cap, and the
429 response identifies which tier rejected it. See rate and usage limits
for the response shape.
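The check order described above can be modeled with a short sketch. This is an illustration of the documented behavior, not Soniox's implementation: the function name `rejecting_tier` and its parameters are hypothetical, and "at its cap" is assumed to mean the current count has reached the limit.

```python
from typing import Optional


def rejecting_tier(
    org_current: int,
    org_limit: Optional[int],
    project_current: int,
    project_limit: Optional[int],
) -> Optional[str]:
    """Return the tier that would reject a new request, or None if it is admitted.

    A limit of None stands for "no cap configured" (assumption: mirrors the
    null value described for the limits field).
    """
    # The organization limit is checked first.
    if org_limit is not None and org_current >= org_limit:
        return "organization"
    # The project limit is checked only if the organization has headroom.
    if project_limit is not None and project_current >= project_limit:
        return "project"
    return None
```

When both tiers are at their caps, this model reports the organization, because that check runs first.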
What gets returned
The response has two top-level scopes, project and organization. Each scope contains:
current - live counts of active requests right now.
limits - the configured cap for that scope, or null when no cap is configured. When a value under project.limits is null, the project has no cap of its own and only the organization limit applies.
Two services are tracked under each scope:
transcribe_concurrent - open Speech-to-Text WebSocket sessions.
tts_concurrent - open Text-to-Speech WebSocket streams and active Text-to-Speech REST requests.
See GET /v1/concurrency-limits for the full schema and field types.
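As a rough sketch of how a client might read this response, the function below computes how many more requests of one service could be opened before either scope reaches its cap. The nesting (scope, then current or limits, then service name) follows the description above, but the exact field types are an assumption, the sample numbers are invented for illustration, and `headroom` is a hypothetical helper rather than part of any Soniox SDK.

```python
from typing import Optional


def headroom(payload: dict, service: str) -> Optional[int]:
    """Remaining slots for `service` across both scopes, or None if uncapped."""
    remaining = []
    for scope in ("organization", "project"):
        data = payload[scope]
        # Assumption: limits may be null as a whole, or per service.
        limits = data.get("limits") or {}
        cap = limits.get(service)
        if cap is None:
            continue  # no cap configured for this scope
        remaining.append(max(cap - data["current"][service], 0))
    # The tightest capped scope decides how many more requests fit.
    return min(remaining) if remaining else None


# Illustrative payload only; values are made up.
sample = {
    "project": {
        "current": {"transcribe_concurrent": 4, "tts_concurrent": 1},
        "limits": {"transcribe_concurrent": 10, "tts_concurrent": None},
    },
    "organization": {
        "current": {"transcribe_concurrent": 18, "tts_concurrent": 3},
        "limits": {"transcribe_concurrent": 20, "tts_concurrent": 5},
    },
}
```

With this sample, `headroom(sample, "transcribe_concurrent")` returns 2 because the organization is the tighter scope, and `headroom(sample, "tts_concurrent")` also returns 2 because the project has no cap of its own for that service.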
You can request higher limits in the Soniox Console.