Connection keepalive
Learn how connection keepalive works.
Overview
In real-time transcription, you may have periods of silence, for example when using client-side VAD (voice activity detection), during pauses in speech, or when you intentionally stop streaming audio.
To keep the session alive and preserve context during these periods, you must send a keepalive control message.
This prevents the WebSocket connection from timing out when no audio is being sent.
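The exact payload of the keepalive message is not shown on this page, so the following is only a minimal sketch. It assumes the control message is a JSON text frame with a `type` field set to `keepalive`; that shape, along with the names `KEEPALIVE_MESSAGE`, `send_keepalive`, and `send_text`, is hypothetical and should be checked against the API reference before use.

```python
import json

# ASSUMPTION: the payload shape below is illustrative only.
# Confirm the exact field names and values in the API reference.
KEEPALIVE_MESSAGE = json.dumps({"type": "keepalive"})


def send_keepalive(send_text):
    """Send one keepalive control message as a text frame.

    `send_text` is any callable that writes a text frame to the open
    WebSocket, for example the send method of your client library's
    connection object.
    """
    send_text(KEEPALIVE_MESSAGE)
    return KEEPALIVE_MESSAGE
```

Passing the send function in as an argument keeps the sketch independent of any particular WebSocket client library.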
When to use
Send a keepalive message whenever:
- You only stream audio during speech (client-side VAD).
- You temporarily pause audio streaming but want to keep the session active.
This ensures that:
- The connection stays open.
- Session context (e.g., speaker labels, language tracking, prompt) is preserved.
Key points
- Send a keepalive at least once every 20 seconds while no audio is being sent.
- You may send keepalives more frequently (every 5–10 seconds is common).
- If neither a keepalive nor audio is received for more than 20 seconds, the connection may be closed.
- You are charged for the full stream duration, not just the audio processed.
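One way to apply the timing rules above is to track when anything was last written to the socket and send a keepalive once the connection has been idle for a chosen interval. The sketch below is illustrative rather than an official client: the class `IdleTracker`, its methods, and the 5-second default are hypothetical names and choices, and only the 20-second limit comes from the key points.

```python
import time

SERVER_TIMEOUT_S = 20.0     # idle limit stated in the key points
KEEPALIVE_INTERVAL_S = 5.0  # hypothetical default within the common 5-10 s range


class IdleTracker:
    """Tracks when audio or a keepalive was last sent on the connection."""

    def __init__(self, interval_s=KEEPALIVE_INTERVAL_S, clock=time.monotonic):
        if not 0 < interval_s < SERVER_TIMEOUT_S:
            raise ValueError("interval must be positive and below the server timeout")
        self._interval_s = interval_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_sent = clock()

    def mark_sent(self):
        """Call after every audio chunk or keepalive written to the socket."""
        self._last_sent = self._clock()

    def keepalive_due(self):
        """Return True once the connection has been idle for a full interval."""
        return self._clock() - self._last_sent >= self._interval_s
```

In a send loop, call `mark_sent()` after each audio chunk; while audio is paused, poll `keepalive_due()` and, when it returns true, send the keepalive message and call `mark_sent()` again. Using a monotonic clock avoids false timeouts if the system time is adjusted.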