Real-time API
Connection keepalive
Learn how connection keepalive works.
Overview
In real-time transcription, you may have periods of silence, for example when using client-side VAD (voice activity detection), during pauses in speech, or when you intentionally stop streaming audio.
To keep the session alive and preserve its context during these periods, send a keepalive control message.
This prevents the WebSocket connection from timing out when no audio is being sent.
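As a rough illustration of sending such a control message, here is a minimal Python sketch. The payload shape (`{"type": "keepalive"}` as a JSON text frame) is an assumption and is not confirmed by this section; check the API reference for the exact message. The names `KEEPALIVE_MESSAGE`, `send_keepalive`, and `send_text` are hypothetical.

```python
import json

# ASSUMPTION: the control message is a JSON text frame with a "type" field.
# The exact payload is not shown in this section; confirm it in the API reference.
KEEPALIVE_MESSAGE = json.dumps({"type": "keepalive"})


def send_keepalive(send_text):
    """Send one keepalive control message.

    `send_text` is any callable that writes a text frame to the open
    WebSocket, for example the bound send method of a WebSocket client.
    """
    send_text(KEEPALIVE_MESSAGE)
```

Passing the sender in as a callable keeps the sketch independent of any particular WebSocket library.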
When to use
Send a keepalive message whenever:
- You only stream audio during speech (client-side VAD).
- You temporarily pause audio streaming but want to keep the session active.
This ensures that:
- The connection stays open.
- Session context (e.g., speaker labels, language tracking, prompt) is preserved.
Key points
- Send a keepalive at least once every 20 seconds while you are not sending audio.
- You may send keepalives more frequently (every 5–10 seconds is common).
- If neither a keepalive nor audio is received for more than 20 seconds, the connection may be closed.
- You are charged for the full stream duration, not just the audio processed.
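The timing rules above could be tracked on the client with something like the following sketch. It only decides when a keepalive is due; actually sending the message is left to the caller. The class name `KeepaliveScheduler`, its methods, and the 10-second default are illustrative choices, not part of the API. Only the 20-second limit and the 5–10 second range come from the text.

```python
import time

KEEPALIVE_LIMIT_S = 20.0     # from the text: send at least once every 20 seconds
KEEPALIVE_INTERVAL_S = 10.0  # illustrative default, within the common 5-10 s range


class KeepaliveScheduler:
    """Tracks when the last frame was sent and reports when a keepalive is due."""

    def __init__(self, interval_s=KEEPALIVE_INTERVAL_S, clock=time.monotonic):
        if not 0 < interval_s < KEEPALIVE_LIMIT_S:
            raise ValueError("interval must be positive and below the 20 s limit")
        self._interval_s = interval_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_sent = clock()

    def mark_sent(self):
        """Call after sending either an audio frame or a keepalive."""
        self._last_sent = self._clock()

    def keepalive_due(self):
        """Return True once `interval_s` has passed since the last frame."""
        return self._clock() - self._last_sent >= self._interval_s
```

In a send loop, call `mark_sent()` after every audio frame, and poll `keepalive_due()` while audio is paused; when it returns `True`, send the keepalive message and call `mark_sent()` again. Because both audio and keepalives reset the timer, no extra keepalives are sent while audio is flowing.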